Historically reflection could be used to break into any code that ran in the same JVM. With Java 9 this is going to change. One of the two main goals of the new module system is strong encapsulation; giving modules a safe space into which no code can intrude. These two techniques are clearly at odds so how can this stand off be resolved? After considerable discussions it looks like the recent proposal of open modules would show a way out.
If you're all down with the module system and what reflection does, you can skip the following back story and jump right into the stand off.
▚Setting the Scene
Let me set the scene of how the module system implements strong encapsulation and how that clashes with reflection.
▚Module Crash Course
The Java Platform Module Saloon (JPMS) is introducing the concept of modules, which in the end are just regular JARs with a module descriptor.
The descriptor is compiled from a module-info.java
file that defines a module's name, its dependencies on other modules, and the packages it makes available:
module some.module {
requires some.other.module;
requires yet.another.module;
exports some.module.package;
exports some.module.other.package;
}
In the context of encapsulation there are two points to take note of:
- Only public types, methods, and fields in exported packages are accessible.
- They are only accessible to modules that require the exporting module.
Here, "being accessible" means that code can be compiled against such elements and that the JVM will allow accessing them at run time. So if code in module a user depends on code in a module owner, all we need to do to make that work is have user require owner and have owner export the packages containing the required types:
module user {
requires owner;
}
module owner {
exports owner.api.package;
}
This is the common case and apart from making the dependencies and API explicit and known to the module system all works as we're used to.
So far everybody's having fun! Then, in comes Reflection... conversations halt mid-sentence, the piano player stops his tune.
▚Reflection
Before Java 9, reflection was allowed to break into any code.
Aside from some pesky calls to setAccessible
every type, method, or field in any class could be made available, could be called, could be changed - hell, even final fields were not safe!
Integer two = 2;
Field value = Integer.class.getDeclaredField("value");
value.setAccessible(true);
value.set(two, 3);
if (1 + 1 != two)
System.out.println("Math died!");
This power drives all kinds of frameworks - starting with JPA providers like Hibernate, coming by testing libraries like JUnit and TestNG, to dependency injectors like Guice, and ending with obsessed class path scanners like Spring - which reflect over our application or test code to work their magic.
On the other side we have libraries that need something from the JDK that it would rather not expose (did anybody say sun.misc.Unsafe
?).
Here as well, reflection was the answer.
So this guy, being used to getting what he wants, now walks into the Module Saloon and the bartender has to tell him no, not this time.
▚The Stand Off
Inside the module system (let's drop the saloon, I think you got the joke) reflection could only ever access code in exported packages. Packages internal to a module were off limits and this already caused quite a ruckus. But it still allowed to use reflection to access everything else in an exported package, like package-visible classes or private fields and methods - this was called deep reflection. In September rules got even stricter! Now deep reflection was forbidden as well and reflection was no more powerful than the statically typed code we would write otherwise: Only public types, methods, and fields in exported packages were accessible.
All if this caused a lot of discussions of course - some heated, some amicable but all with the sense of utter importance.
Some (myself included) favor strong encapsulation, arguing that modules need a safe space in which they can organize their internals without the risk of other code easily depending on it.
Examples I like to give are JUnit 4, where one big reason for the rewrite was that tools depended on implementation details, reflecting down to the level of private fields; and Unsafe
, whose pending removal put a lot of pressure on a lot of libraries.
Others argue that the flexibility provided by reflection not only enables great usability for the many frameworks relying on it, where annotating some entities and dropping hibernate.jar onto the class path suffices to make things work.
It also gives freedom to library users, who can use their dependencies the way they want, which might not always be the way the maintainers intended to.
Here, Unsafe
comes in as an example for the other side: Many libraries and frameworks that are now critical to the Java ecosystem were only feasible exactly because some hacks were possible without the JDK team's approval.
Even though I tend towards encapsulation, I see the other arguments' validity as well. So what to do? What choices to developers have besides encapsulating their internals and giving up on reflection?
▚Choice of Weapons
So let's say we are in a position where we need to make a module's internals available via reflection. Maybe to expose it to a library or framework module or maybe because we are that other module and want to break into the first one. In the rest of the article we'll explore all available choices, looking for answers to these questions:
- What privileges do we need to employ that approach?
- Who can access the internals?
- Can they be accessed at compile time as well?
For this exploration we will create two modules.
One is called owner and contains a single class Owner
(in the package owner
) with one method per visibility that does nothing.
The other, intruder, contains a class Intruder
that has no compile time dependency on Owner
but tries to call its methods via reflection.
Its code comes down to this:
Class<?> owner = Class.forName("owner.Owner");
Method owned = owner.getDeclaredMethod(methodName);
owned.setAccessible(true);
owned.invoke(null);
The call to setAccessible
is the critical part here, it succeeds or fails depending on how we decide to create and execute our modules.
In the end we get output as follows:
public: ✓ protected: ✗ default: ✗ private: ✗
(Here only the public method could be accessed.)
All the code I'm using here can be found in a GitHub repository, including Linux scripts that run it for you.
▚Regular Exports
This is the vanilla approach to expose an API: The module owner simply exports the package owner
.
To do this we need of course be able to change the owning module's descriptor.
module owner {
exports owner;
}
With this we get the following result:
public: ✓ protected: ✗ default: ✗ private: ✗
So far, err... not so good. First of all, we only reached part of our goal because the intruding module can only access public elements. And if we do it this way all modules that depend on owner can compile code against it and all modules can reflect over its internals. Actually, they are no longer internals at all since we properly exported them - the package's public types are now baked into the module's API.
▚Qualified Exports
If exports are vanilla, this is cranberry vanilla - a default choice with an interesting twist. The owning module can export a package to a specific module with what is called a qualified export:
module owner {
exports owner to intruder;
}
But the result is the same as with regular exports - the intruding module can only access public elements:
public: ✓ protected: ✗ default: ✗ private: ✗
Again, we reached only part of our goal, and again, we exposed the elements at compile time as well as at run time. The situation improved in the sense that only the named module, intruder in this case, is granted access but for that we accepted the necessity to actually know the module's name at compile time.
Knowing the intruding module might be amenable in the case of frameworks like Guice but as soon as the implementation hides behind an API (what the JDK team calls an abstract reflective framework; think JPA and Hibernate) this approach fails. Independently of whether it works or not, explicitly naming the intruding module in the owning module's descriptor can be seen as iffy. On the other hand, chances are the owning module already depends on the intruding one anyways because it needs some annotations or something, in which case we're not making things much worse.
▚Open Packages
Now it gets interesting. A pretty recent addition to the module system is the ability for modules to open up packages at run time only.
module owner {
opens owner;
}
Yielding:
public: ✓ protected: ✓ default: ✓ private: ✓
Neat! We killed two birds with one stone:
- The intruding module gained deep access to the whole package, allowing it to use even private elements.
- This exposure exists at run time only, so code can not be compiled against the package's content.
There is one downside, though: All modules can reflect over the opened package, now. Still, all in all much better than exports.
▚Qualified Open Packages
As with exports and qualified exports, there exists a qualified variant of open packages as well:
module owner {
opens owner to intruder;
}
Running the program we get the same result as before but now only intruder can achieve them:
public: ✓ protected: ✓ default: ✓ private: ✓
This presents us with the same trade-off as between exports and qualified exports and also doesn't work for a separation between API and implementation. But there's hope!
In November Mark Reinhold proposed a mechanism that would allow code in the module to which a package was opened up to transfer that access to a third module. Coming back to JPA and Hibernate this solves that problem exactly. Assume the following module descriptor for owner:
module owner {
// the JPA module is called java.persistence
opens owner to java.persistence;
}
In this case the mechanism could be employed as follows (quoted almost verbatim from the proposal):
A JPA entity manager is created via one of the
Persistence::createEntityManagerFactory
methods, which locate and initialize a suitable persistence provider, say Hibernate. As part of that process they can use theaddOpens
method on the client module owner to open theowner
package to the Hibernate module. This will work since theowner
module opens that package to thejava.persistence
module.
There is also a variant for containers to open packages to implementations. In the current EA build (b146) this feature does not seem to be implemented yet, though, so I couldn't try it out. But it definitely looks promising!
▚Open Modules
If open packages were a scalpel, open modules are a cleaver.
With it a module relinquishes any control over who accesses what at run time and opens up all packages to everybody as if there were an opens
clause for each of them.
open module owner { }
This results in the same access as individually opened packages:
public: ✓ protected: ✓ default: ✓ private: ✓
Open modules can be considered an intermediate step on the migration path from JARs on the class path to full-blown, strongly encapsulating modules.
▚Class Path Trickery
Now we're entering less modular ground.
As you might know java
and javac
require modules to be on the module path, which is like the class path but for modules.
But the class path is not going away and neither are JARs.
There are two tricks we can employ if we have access to the launching command line and can push the artifact around (so this won't work for JDK modules).
▚Unnamed Module
First, we can drop the owning module onto the class path.
How does the module system react to that? Since everything needs to be a module the module system simply creates one, the unnamed module, and puts everything in it that it finds on the class path. Inside the unnamed module everything is much like it is today and JAR hell continues to exist. Because the unnamed module is synthetic, the JPMS has no idea what it might export so it simply exports everything - at compile and at run time.
If any JAR on the class path should accidentally contain a module descriptor, this mechanism will simply ignore it. Hence, the owning module gets demoted to a regular JAR and its code ends up in a module that exports everything:
public: ✓ protected: ✓ default: ✓ private: ✓
Ta-da! And without touching the owning module, so we can do this to modules we have no control over. Small caveat: We can not require the unnamed module so there is no good way to compile against the code in the owning module from other modules. Well, maybe the caveat is not so small after all...
▚Automatic Module
The second approach is to strip the owning module of its descriptor and still put it on the module path. For each regular JAR on the module path the JPMS creates a new module, names it automatically based on the file name, and exports all its contents. Since all is exported, we get the same result as with the unnamed module:
public: ✓ protected: ✓ default: ✓ private: ✓
Nice. The central advantage of automatic modules over the unnamed module is that modules can require it, so the rest of the application can still depend on and compile against it, while the intruder can use reflection to access its internals.
One downside is that the module's internals become available at run time to every other module in the system. Unfortunately, the same is true at compile time unless we manage to compile against the proper owning module and then rip out its descriptor on the way to the launch pad. This is iffy, tricky, and error-prone.
▚Command Line Escape Hatches
Since we're fiddling with the command line anyway, there is a cleaner approach (maybe I should've told you about it earlier): Both javac
and java
come with a new flag --add-opens
, which opens additional packages.
java
--module-path mods
--add-modules owner
--add-opens owner/owner=intruder
--module intruder
This works without changing the owning module and applies to JDK modules as well. So yeah, much better than the unnamed and automatic module hacks.
▚Encapsulation Kill Switch
If all of this seems too cumbersome and you just want to get it working for your class path application, --permit-illegal-access
, amicably dubbed the encapsulation "kill switch", is the right command line argument for you.
With it, all code in the unnamed module can access other types regardless of any limitations that strong encapsulation imposes.
In exchange you get warnings on illegal accesses and the stifling feeling of living on burrowed time: The option will only exist in Java 9 to ease migration, so you just bought yourself two, three years.
Here's how to launch with owner on the module and intruder on the class path:
java
--class-path mods/intruder.jar
--module-path mods/owner.jar
--add-modules owner
--permit-illegal-access
intruder.Intruder
And here's the result - note how the warnings screw with my carefully created output:
WARNING: --permit-illegal-access will be removed in the next major release
public:WARNING: Illegal access by intruder.Intruder (file:mods/intruder.jar) to method owner.Owner.publicMethod() (permitted by --permit-illegal-access)
✓ protected:WARNING: Illegal access by intruder.Intruder (file:mods/intruder.jar) to method owner.Owner.protectedMethod() (permitted by --permit-illegal-access)
✓ default:WARNING: Illegal access by intruder.Intruder (file:mods/intruder.jar) to method owner.Owner.defaultMethod() (permitted by --permit-illegal-access)
✓ private:WARNING: Illegal access by intruder.Intruder (file:mods/intruder.jar) to method owner.Owner.privateMethod() (permitted by --permit-illegal-access)
✓
As I've said, this only works if the intruder comes from the unnamed module.
The rationale behind that is that properly modularized code, meaning code that lives in modules, should not need such hacks.
Indeed, if we put both artifacts on the module path, we can see that the flag is not exercised (no warnings beyond the initial one) and access fails because the internals of owner
are strongly encapsulated:
java
--module-path mods
--add-modules owner
--permit-illegal-access
--module intruder
WARNING: --permit-illegal-access will be removed in the next major release
public: ✗ protected: ✗ default: ✗ private: ✗
▚Summary
Ookey, still remember everything we did? No? Executive summary table to the rescue!
mechanism | access | compile access | reflection access | comments |
---|---|---|---|---|
export | descriptor | all code ~> public | all code ~> public | makes API public |
qualified export | descriptor | specified modules ~> public | specified modules ~> public | need to know intruding modules |
open package | descriptor | none | all code ~> private | |
qualified open package | descriptor | none | specified modules ~> private | can be transfered to implementation modules |
open module | descriptor | none | all code ~> private | one keyword to open all packages |
unnamed module | command line | all non-modules ~> public | all code ~> private | |
automatic module | command line and artifact | all code ~> public | all code ~> private | requires fiddling with the artifact |
command line flag | command line | none | all code ~> private |
Wow, we really went through quite a number of options! But now you know what to do if you're faced with the task to break into a module with reflection. In summary, I think the vast majority of use cases can be covered by answering one question:
Is it your own module?
- Yes ⇝ Open packages (maybe qualified) or, if there are too many, the entire module.
- No ⇝ Use the command line flag
--add-opens
.