Wednesday, December 7, 2011

Setting up command line Apache Xalan XSLT processor

To run Xalan from the command line, the Xalan JARs must be on the system classpath. Create or edit the environment variable named CLASSPATH and set it to a value like this:

CLASSPATH=C:\Projects\Java\Runtime\lib\xalan.jar;C:\Projects\Java\Runtime\lib\xml-apis.jar;C:\Projects\Java\Runtime\lib\serializer-2.7.0.jar;

Ultimately you need three JARs on the classpath:
  • xalan.jar
  • xml-apis.jar
  • serializer-#.#.#.jar
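You cannot take a shortcut and just specify a folder containing these JARs: a bare directory entry only picks up .class files, so you must give Java the full path to each JAR. (On Java 6 and later, a wildcard entry such as C:\Projects\Java\Runtime\lib\* is also accepted.) For a more permanent solution, put the three JARs in the \lib\endorsed directory of your JRE; the JVM then loads them on every run without any CLASSPATH setting. Note that the endorsed directory mechanism was removed in Java 9, so this only applies to older runtimes.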
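A quick way to confirm the classpath is set correctly is to ask the JVM what it can see. The class below is only a sketch (the name ClasspathCheck is made up for this post): it tries to load org.apache.xalan.xslt.Process, the class used on the command line later, and prints which TransformerFactory implementation JAXP picks up. With the Xalan JARs visible, the factory is normally the one from xalan.jar rather than the JDK's built-in one, but treat the exact class name as something to check on your own machine.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical helper, not part of Xalan: reports whether the Xalan
// command-line class is visible and which XSLT factory JAXP selects.
class ClasspathCheck {

    // True if the class behind "java org.apache.xalan.xslt.Process" can be loaded.
    static boolean isXalanOnClasspath() {
        try {
            Class.forName("org.apache.xalan.xslt.Process");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Name of the TransformerFactory implementation JAXP resolves to.
    static String factoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("Xalan Process class found: " + isXalanOnClasspath());
        System.out.println("TransformerFactory in use: " + factoryClassName());
    }
}
```

If the first line prints false, the CLASSPATH value is not reaching the JVM, so check for typos in the JAR paths.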

You should already have a JRE or JDK in the system PATH as well:

set PATH=C:\SDKs\jdk1.6.0_18\bin;%PATH%

With that in place you can transform XML using XSLT on the command line with the Xalan processor like this:

java org.apache.xalan.xslt.Process -out output.out -in input.xml -xsl transform.xslt

Where
output.out is the file to create,
input.xml is the input xml file to transform, and
transform.xslt is the XSLT XML file to transform the input xml file with.
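If you would rather run the same transformation from Java code, the standard JAXP API does the equivalent job. This is a minimal sketch, assuming Java 7 or later; the class name XsltRunner is made up for illustration. JAXP uses whichever XSLT implementation it finds, which should be Xalan when the JARs above are on the classpath and the JDK's built-in processor otherwise.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper mirroring: -in input.xml -xsl transform.xslt -out output.out
class XsltRunner {

    static void transform(File in, File xsl, File out) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet, then apply it to the input document.
        Transformer transformer = factory.newTransformer(new StreamSource(xsl));
        OutputStream os = new FileOutputStream(out);
        try {
            transformer.transform(new StreamSource(in), new StreamResult(os));
        } finally {
            os.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: java XsltRunner input.xml transform.xslt output.out");
            return;
        }
        transform(new File(args[0]), new File(args[1]), new File(args[2]));
    }
}
```

Run it with the same three file names as the command above: java XsltRunner input.xml transform.xslt output.out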

Running Xalan this way is handy because you can also give the JVM a larger heap, for example 2 GB, to process large input files:

java -Xms2g -Xmx2g org.apache.xalan.xslt.Process -out output.out -in input.xml -xsl transform.xslt
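To check that the heap flags actually took effect, you can print the maximum heap the JVM reports. This is just a small sketch (HeapCheck is a made-up name); run it with the same -Xms and -Xmx flags you pass to Xalan. The reported figure can come out slightly below the requested size depending on the JVM and garbage collector.

```java
// Hypothetical helper: prints the maximum heap size the JVM will try to use.
class HeapCheck {

    // Maximum heap in megabytes, as reported by the running JVM.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

For example: java -Xms2g -Xmx2g HeapCheck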

If you are running into out-of-memory (Java heap space) errors while transforming XML with XSLT, this is a good way to get past the problem. If a larger heap is still not enough, look at breaking the XML apart with Java or Perl before transforming it with XSLT. You might also be able to use a 64-bit JVM, which can address far more memory than a 32-bit JVM, potentially more than the physical RAM in the machine.
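Breaking the XML apart in Java could look something like the following. It is a rough sketch, assuming Java 7 or later and an input made of many repeated record elements under one root; the class name XmlSplitter and the part-N.xml file naming are made up for illustration. It streams the input with StAX, so the whole document is never held in memory, and writes each record element to its own file. Namespace declarations made on the root element are not copied into the parts, and the approach only helps when the stylesheet can work on one record at a time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: splits a large XML file into one file per record element.
class XmlSplitter {

    // Writes every <recordName> element to outDir/part-N.xml and returns how many were written.
    static int split(Path input, String recordName, Path outDir)
            throws IOException, XMLStreamException {
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        int count = 0;
        try (InputStream is = Files.newInputStream(input)) {
            XMLEventReader reader = inFactory.createXMLEventReader(is);
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(recordName)) {
                    count++;
                    Path part = outDir.resolve("part-" + count + ".xml");
                    try (OutputStream os = Files.newOutputStream(part)) {
                        XMLEventWriter writer = outFactory.createXMLEventWriter(os, "UTF-8");
                        copyElement(event, reader, writer);
                        writer.flush();
                        writer.close();
                    }
                }
            }
            reader.close();
        }
        return count;
    }

    // Copies events from the given start element through its matching end element.
    private static void copyElement(XMLEvent start, XMLEventReader reader, XMLEventWriter writer)
            throws XMLStreamException {
        int depth = 0;
        XMLEvent event = start;
        while (true) {
            writer.add(event);
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement() && --depth == 0) {
                return;
            }
            event = reader.nextEvent();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: java XmlSplitter input.xml recordElementName outputDir");
            return;
        }
        Path outDir = Files.createDirectories(Paths.get(args[2]));
        System.out.println("Wrote " + split(Paths.get(args[0]), args[1], outDir) + " parts");
    }
}
```

Each part file can then be run through the Xalan command shown earlier, one at a time, with a much smaller heap than the full document would need.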
