Java Planet: Modifying the CLASSPATH at run time

Here's a question that comes my way occasionally: How can you change the search path for class loading at run time?

For example, let's say I have an application that reads the name of a JAR file from an external source, and then needs to add that JAR to the classpath so that it can load classes from it. This is something that's more likely to come up in a server environment, where you need the server to be able to add plug-in classes dynamically. For example, application servers like Tomcat need to be able to unpack a WAR file when requested; after unpacking there will be a 'classes' directory and a 'lib' directory full of JARs, all of which have to be added to the loading path so that the application can be started.

The solution to this problem requires an understanding of how the ClassLoader hierarchy works, so I'm going to cover that in some detail first.

ClassLoaders

The JVM includes a loader, usually referred to as the Boot loader. The default search path that this loader uses includes the Java runtime classes - java.lang, java.util, etc.

When the JVM is started it creates a ClassLoader object (loaded by the boot loader), usually referred to as the Extension ClassLoader. Its search path includes several JAR files found in the JVM's jre/lib/ext/ directory.

Then, the System loader is created. The search path for this loader is initialized from the CLASSPATH environment variable or from the value passed as the -cp option on the command line. (The label 'System' is a bit confusing; personally I think 'Application' class loader would be a more accurate and descriptive name.)

ClassLoaders are arranged in a hierarchy; each loader has a parent loader. The extension loader is the parent of the system loader; the parent of the extension loader is usually set as null. The boot loader is something of an exception in this respect - it's technically the parent of the extension loader, but because it's part of the native implementation of the JVM and hence not a Class in the usual sense, it normally can't be accessed as a Java object.
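To see the hierarchy on your own JVM, here is a minimal sketch that walks from the system loader up through getParent() until it reaches null. The class name LoaderChain and its chain() helper are made up for this illustration. The loader class names printed vary by JVM version; on Java 9 and later the middle loader is called the platform loader rather than the extension loader.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {

    /** Collects the class name of each loader from 'start' up to the top of the chain. */
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The boot loader has no Java object, so getParent() reports it as null.
        names.add("(boot loader, represented as null)");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes come from the boot loader, so this prints "null".
        System.out.println(String.class.getClassLoader());
    }
}
```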

When the JVM recognizes that it needs to load a new class, it calls a loader to do that. The loader it chooses is the same loader that loaded the class where the new class is first referenced at run time - that means that by default, when your application code first references a class that hasn't yet been loaded, it will call the system loader (the one that loaded your application classes).

Before searching its own path, the loader delegates the request to its parent if it has one. The result of this is that all requests for new class references get delegated all the way up to the boot loader. So, if your code has requested a class that's to be found in the Java runtime, such as java.util.Map or java.text.Format, the boot loader will find and load the class.

If a loader can't locate the class, it reports that to its caller - so if the class you requested is not in the boot loader path, the extension loader that delegated the request is told so. If the class isn't in any of the extension JARs either, the request passes back to the system loader. The system loader then attempts to find the class; this is where your application classes get resolved. (Of course, if the request makes it all the way back down the hierarchy without the class being found, you'll get a ClassNotFoundException.)

To expand slightly: when a loader is called to load a class, this is the sequence of actions:

1 - the loader checks its local data to see if it has already loaded the class. If it finds it, the loader returns at this point.

2 - otherwise, it delegates to the parent loader if there is one (or to the boot loader if the parent is null). If the parent finds the class, the loader returns to its caller at this point.

3 - if the parent doesn't find the class, the loader attempts to find the class definition in its own search path. If the class definition is found in the path, the class is loaded and added to the loader's local data, and the loader returns.

4 - if this point is reached, the class hasn't been found - the loader returns control to its caller indicating such.
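The sequence can be sketched as a small ClassLoader subclass. This follows the order documented for ClassLoader.loadClass(): the already-loaded check, then the parent, then the loader's own search path. DelegationSketch is a made-up name, and this is an illustration rather than the JDK's actual source. It insists on a non-null parent because the boot loader lookup isn't available to application code, and its own search path is empty (the inherited findClass() always fails).

```java
import java.util.Objects;

class DelegationSketch extends ClassLoader {

    DelegationSketch(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Has this loader already loaded the class?
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Delegate to the parent, which repeats the same sequence.
                    c = getParent().loadClass(name);
                } catch (ClassNotFoundException notInParent) {
                    // Search this loader's own path. The inherited findClass()
                    // throws ClassNotFoundException, which reports "not found".
                    c = findClass(name);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader cl = new DelegationSketch(ClassLoader.getSystemClassLoader());
        // Found via delegation; the boot loader defined it, so this prints "null".
        System.out.println(cl.loadClass("java.util.Map").getClassLoader());
        try {
            cl.loadClass("no.such.Class");
        } catch (ClassNotFoundException e) {
            System.out.println("not found anywhere in the chain");
        }
    }
}
```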

The Answer

Back to the original question: How do you add more places to the search? The way to do that is to create a new loader with the locations you want to search set as its search path, and add this new loader into the hierarchy.

ClassLoader is an abstract class, and so can't be instantiated. Instead you'd normally use URLClassLoader, a concrete subclass that does everything you would usually need. You can create your own loader classes by extending ClassLoader, but normally this is unnecessary.

The search path for URLClassLoader is provided as an array of java.net.URL objects; each URL identifies a directory or an archive file (.jar or .zip) to be searched when loading.

Let's say I have a JAR named /tmp/my-jar.jar and it contains a class called com.example.MyClass. I need to create an instance of this class. This code should do the trick:

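    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;

    // First, set the search path
    URL[] searchPath = new URL[1];
    searchPath[0] = new File("/tmp/my-jar.jar")
                            .toURI()
                            .toURL();

    // Now create a new loader
    ClassLoader cl = new URLClassLoader(searchPath);

    // Now we can load from the JAR. getDeclaredConstructor().newInstance()
    // replaces Class.newInstance(), which is deprecated since Java 9.
    Object o = Class.forName("com.example.MyClass",
                             true,
                             cl)
                    .getDeclaredConstructor()
                    .newInstance();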

A few notes about this code:

First, the URL array can contain URLs for directories as well as JAR and ZIP files. The example here has only one entry but you could provide an array containing hundreds of entries if you needed to. Note that you can't use wildcards here - each entry must point to a single archive file or directory.

Second, the loader created by the URLClassLoader constructor will have the system loader as a parent by default. You can provide a different parent as a second parameter to the constructor - this allows you to build a full-blown hierarchical tree of loaders within your application if you so wish.

Third, note the three-parameter call to Class.forName() - the first parameter is the class name, of course, as in the one-parameter call. The third parameter specifies our new loader as the one to use to load our class; the default is to use the same loader that loaded the calling class (this.getClass().getClassLoader()). The second parameter determines whether or not the class should be initialized (i.e. have its static initializers run) and you'd normally set this to true (offhand I can't think of a circumstance where you wouldn't want to do this).

Lastly, note that the new loader becomes the default for classes referenced by the newly-loaded classes. This means that MyClass can reference other classes in my-jar.jar implicitly or explicitly (i.e. using the one-parameter Class.forName() method) and the classes will be loaded correctly.

Using this you can create a structure of loaders organized as you need to implement different search paths for different requirements. For example, Tomcat uses one branch of a loader tree for its own server classes and another as a connection point for loading web applications; each webapp gets its own subtree. That's how multiple webapps can coexist even with conflicting class names or versions, and without being able to access the server's internal classes.
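As a rough illustration of choosing the parent, the sketch below builds three loaders over the same search path: one with the default parent, one chained beneath it, and one with a null parent that can see only what the boot loader provides. The class name ParentChoice, the visibleFrom() helper and the empty temp directory standing in for a plug-in location are all made up for this example.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;

class ParentChoice {

    /** Returns true if the given loader (or one of its ancestors) can load the class. */
    static boolean visibleFrom(ClassLoader cl, String className) {
        try {
            cl.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical search path: an empty temp directory.
        URL[] path = { Files.createTempDirectory("plugins").toUri().toURL() };

        try (URLClassLoader shared = new URLClassLoader(path);            // parent: system loader
             URLClassLoader webapp = new URLClassLoader(path, shared);    // parent: 'shared'
             URLClassLoader isolated = new URLClassLoader(path, null)) {  // parent: boot loader only

            System.out.println(shared.getParent() == ClassLoader.getSystemClassLoader()); // true
            System.out.println(webapp.getParent() == shared);                             // true
            System.out.println(isolated.getParent());                                     // null

            // Runtime classes are still reachable through the boot loader.
            System.out.println(visibleFrom(isolated, "java.lang.String"));                // true
        }
    }
}
```

A null parent is one way to keep plug-in code away from your application's own classes, since nothing on the normal classpath is visible through that loader.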

Where the Class definitions are kept

Each loader keeps the Class objects that it loads in its own local space.

This means that if you create two loaders, each with the system loader as parent but with common directories and/or archive files in their search paths, the same class can be loaded twice, once by each loader. The JVM treats the two resulting Class objects as distinct types even though they share a name, so an instance created from one cannot be cast to the other.
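Here is a small sketch of that effect. To stay self-contained it compiles a tiny class into a temp directory using the JDK's built-in compiler (so it needs a JDK, not a bare JRE) and then loads it through two separate loaders. The names LoadedTwice and demo.Plugin are invented for the example.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class LoadedTwice {

    /** Compiles the hypothetical class demo.Plugin into a fresh temp directory. */
    static Path compilePlugin() throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no compiler available: run on a JDK");
        }
        Path dir = Files.createTempDirectory("loaded-twice");
        Path src = dir.resolve("Plugin.java");
        Files.write(src, "package demo; public class Plugin {}"
                .getBytes(StandardCharsets.UTF_8));
        int status = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed");
        }
        return dir;
    }

    /** Loads demo.Plugin through two sibling loaders that share one search path. */
    static Class<?>[] loadTwice() throws Exception {
        URL[] path = { compilePlugin().toUri().toURL() };
        try (URLClassLoader first = new URLClassLoader(path);
             URLClassLoader second = new URLClassLoader(path)) {
            return new Class<?>[] {
                first.loadClass("demo.Plugin"),
                second.loadClass("demo.Plugin")
            };
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?>[] pair = loadTwice();
        System.out.println(pair[0].getName().equals(pair[1].getName())); // true
        System.out.println(pair[0] == pair[1]);                          // false
        Object fromFirst = pair[0].getDeclaredConstructor().newInstance();
        System.out.println(pair[1].isInstance(fromFirst));               // false
    }
}
```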

Other things URLClassLoader can do

To finish up, here are a couple of other useful things that you can do:

First, there's a method URLClassLoader.getURLs() that returns the loader's current search path as an array of URLs. This can be useful for debugging.

Second, loaders aren't limited to finding .class files - you can use them to find other resources that are in the search path. This applies to all loaders (i.e. ClassLoader and all its subclasses, not just URLClassLoader). This is extremely useful because it allows you to, for example, read from a property file embedded inside a JAR. Some methods that are especially useful are:

ClassLoader.getResource() - returns the URL of a named resource;

ClassLoader.getResourceAsStream() - returns an InputStream allowing you to read a named resource directly (handy for loading .properties files);

ClassLoader.getSystemResource() and ClassLoader.getSystemResourceAsStream() - static methods that do the same as the above methods, but use the system loader rather than a specific one that you may have created.
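The following sketch ties these together: it prints a loader's search path with getURLs(), locates a resource with getResource(), and reads it with getResourceAsStream(). The class name ResourceDemo, the loadFrom() helper and the file loader-demo.properties are made up for the illustration; a temp directory stands in for a JAR on the search path.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Properties;

class ResourceDemo {

    /** Reads a .properties resource through the given loader's search path. */
    static Properties loadFrom(ClassLoader cl, String name) throws IOException {
        Properties props = new Properties();
        try (InputStream in = cl.getResourceAsStream(name)) {
            if (in == null) {
                // getResourceAsStream() returns null when nothing matches.
                throw new FileNotFoundException(name);
            }
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical resource location: a temp directory holding one properties file.
        Path dir = Files.createTempDirectory("resources");
        Files.write(dir.resolve("loader-demo.properties"),
                "greeting=hello".getBytes(StandardCharsets.UTF_8));

        try (URLClassLoader cl = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
            System.out.println(Arrays.toString(cl.getURLs()));            // the search path
            System.out.println(cl.getResource("loader-demo.properties")); // where it was found
            System.out.println(
                loadFrom(cl, "loader-demo.properties").getProperty("greeting")); // hello
        }
    }
}
```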

Labels: Pure Java

06 June 2008
