What is JAR hell? (Or is it classpath hell? Or dependency hell?) And which aspects are still relevant when considering modern development tools like Maven or OSGi?
Interestingly enough there seems to be no structured answer to these questions (i.e. even the second page listed no promising headlines). This post is supposed to fill that gap. It starts with a list of problems that make up JAR hell, momentarily ignoring build tools and component systems. We will come back to them for the second part when we assess the current state of affairs.
JAR Hell is an endearing term referring to the problems that arise from the characteristics of Java's class loading mechanism. Some of them build on one another; others are independent.
A JAR cannot express which other JARs it depends on in a way that the JVM will understand. An external entity is required to identify and fulfill the dependencies. Developers would have to do this manually by reading the documentation, finding the correct projects, downloading the JARs and adding them to the project. Optional dependencies, where a JAR might only require another JAR if the developer wants to use certain features, further complicate the process.
The runtime will not detect unfulfilled dependencies until it needs to access them.
This will lead to a
NoClassDefFoundError crashing the running application.
For an application to work it might only need a handful of libraries. Each of those in turn might need a handful of other libraries, and so on. As the problem of unexpressed dependencies is compounded it becomes exponentially more labor-intensive and error-prone.
Sometimes different JARs on the classpath contain classes with the same fully-qualified name. This can happen for different reasons, e.g. when there are two different versions of the same library, when a fat JAR contains dependencies that are also pulled in as standalone JARs, or when a library is renamed and unknowingly added to the classpath twice.
Since classes will be loaded from the first JAR on the classpath to contain them, that variant will "shadow" all others and make them unavailable.
If the variants differ semantically, this can lead to anything from too-subtle-to-notice-misbehavior to havoc-wreaking-errors. Even worse, the form in which this problem manifests itself can seem non-deterministic. It depends on the order in which the JARs are searched. This may well differ across different environments, for example between a developer's IDE and the production machine where the code will eventually run.
This problem arises when two required libraries depend on different, non-compatible versions of a third library.
If both versions are present on the classpath, the behavior will be unpredictable. First, because of shadowing, classes that exist in both versions will only be loaded from one of them. Worse, if a class that exists in one but not the other is accessed, that class will be loaded as well. Code calling into the library might hence find a mix of both versions.
Since non-compatible versions are required, the program will most likely not function correctly if one of them is missing.
Again, this can manifests itself as unexpected behavior or as
▚Complex Class Loading
By default all application classes are loaded by the same class loader but developers are free to add additional class loaders.
This is typically done by containers like component systems and web servers. Ideally this implicit use is completely hidden from application developers but, as we know, all abstractions are leaky. In some circumstances developers might explicitly add class loaders to implement features, for example to allow their users to extend the application by loading new classes, or to be able to use conflicting versions of the same dependency.
Regardless of how multiple class loaders enter the picture, they can quickly lead to a complex mechanism that shows unexpected and hard to understand behavior.
▚Classpath Hell and Dependency Hell
Classpath hell and JAR hell are essentially the same thing, although the latter seems to focus a little more on the problems arising from complex class loader hierarchies. Both terms are specific to Java and the JVM.
Dependency hell, on the other hand, is a more widely used term. It describes general problems with software packages and their dependencies and applies to operating systems as well as to individual development ecosystems. Given its universality it does not cover problems specific to single systems.
From the list above it includes transitive and maybe unexpressed dependencies as well as version conflicts. Class loading and shadowing are Java specific mechanics, which would not be covered by dependency hell.
▚State Of Affairs
Looking over the list of problems we see how build tools help with some of them. They excel in making dependencies explicit so that they can hunt down each required JAR along the myriad edges of the transitive dependency tree. This largely solves the problems of unexpressed and transitive dependencies.
But Maven et al. do nothing much about shadowing. While they generally work towards reducing duplicate classes, they can not prevent them. Build tools do also not help with version conflicts except to point them out. And since class loading is a runtime construct they do not touch on it either.
This comes with additional complexity, though, and often requires the developer to take a deeper dive into class loader mechanics. Ironically, also a point on the list above.
But regardless of whether or not component systems indeed considerably ease the pain of JAR hell, I am under the impression that a vast majority of projects does not employ them. Under this assumption said vast majority still suffers from classpath-related problems.
▚Where does this leave us?
Because they are not widely used, component systems leave the big picture untouched. But the ubiquity of build tools considerably changed the severity of the different circles of JAR hell.
No build tool supported project I partook in or heard of spent a mentionable amount of time dealing with problems from unexpressed or transitive dependencies. Shadowing rears its ugly head every now and then and requires a varying amount of time to be solved - but it always eventually is.
But every project sooner or later fought with dependencies on conflicting versions and had to make some hard decisions to work these problems out. Usually some desired update had to be postponed because it would force other updates that could currently not be performed.
Version conflicts are the single most problematic aspect of JAR hell.
I'd venture to say that for most applications, services, and libraries of decent size, version conflicts are one of the main deciding factors for when and how dependencies are updated. I find this intolerable.
I have too little experience with non-trivial class loader hierarchies to asses how much of a recurring problem they are. But given the fact that none of the projects I have worked on so far required them, I'd venture to say that they are not commonplace. Searching the net for reasons to use them often turns up what we already discussed: dependencies resulting in conflicting versions.
So based on my experience I'd say that conflicting versions are the single most problematic aspect of JAR hell.
We have discussed the constituents of JAR hell:
- unexpressed dependencies
- transitive dependencies
- version conflicts
- complex class loading
Based on what build tools and component systems bring to the game and how widely they are used we concluded that unexpressed and transitive dependencies are largely solved, shadowing at least eased and complex class loading not commonplace.
This leaves version conflicts as the most problematic aspect of JAR hell, influencing everyday update decisions in most projects.
In my next post I will discuss how Jigsaw addresses these issues. If you are interested, you can follow me. And if you liked this post. why not share it with your friends and followers?