The Apache Jakarta site is home to many well-known Java open source
projects, including Tomcat, Ant, and log4j. A lesser-known subproject
of Jakarta is Jakarta Commons, a repository of reusable Java
components. These components, such as Commons BeanUtils, Commons DBCP,
and Commons Logging, alleviate the pain of some standard programming
tasks. This article will focus on the Jakarta Commons Digester, a
utility that maps XML files to Java objects.
Note:
To use Digester, you must have the following libraries in your
classpath: Digester, BeanUtils, Collections, Logging, and an XML parser
conforming to SAX (Simple API for XML) 2.0 or JAXP (Java
API for XML Processing) 1.1. Links to all Jakarta Commons components,
along with two suitable parsers, Crimson and Xerces, can be found in Resources.
XML parsing overview
There are two basic approaches to parsing XML documents.
One is the Document Object Model (DOM) method. When parsing an XML
document with DOM, the parser reads the entire document and creates a
tree-like representation of it in memory. The second method uses SAX,
which reads the document sequentially and reports events (such as the
start or end of an element) as it encounters them. The DOM method,
while sometimes easier to implement, is generally slower and more
resource-intensive than SAX. Digester
simplifies SAX parsing by providing a higher-level interface to SAX
events. This interface hides much of the complexity involved in XML
document navigation, allowing developers to concentrate on processing
XML data instead of parsing it.
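To make the contrast concrete, here is a minimal sketch of event-based parsing with the plain SAX API that ships with the JDK, without Digester. The class name RawSaxNames and its readNames() method are invented for this illustration. Note that the handler must track for itself which element it is currently inside; this is the kind of bookkeeping Digester is meant to take over.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of every <name> element using raw SAX events.
class RawSaxNames {

    static List<String> readNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inName = false;                  // state we have to track ourselves
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("name".equals(qName)) {
                    inName = true;
                    text.setLength(0);                       // start collecting a fresh body
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inName) {
                    text.append(ch, start, length);          // text may arrive in several chunks
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString().trim());
                    inName = false;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<datasources>"
                + "<datasource><name>HsqlDataSource</name></datasource>"
                + "<datasource><name>OracleDataSource</name></datasource>"
                + "</datasources>";
        System.out.println(readNames(xml));
    }
}
```

Even for a single element, the handler needs a flag, a buffer, and three callbacks; every additional element of interest adds more state of the same kind.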
Digester concepts
Digester introduces three important concepts: element matching patterns, processing rules, and the object stack.
Element matching patterns associate XML elements with processing rules. The following example shows the element matching patterns for an XML hierarchy:
<datasources>          'datasources'
  <datasource>         'datasources/datasource'
    <name/>            'datasources/datasource/name'
    <driver/>          'datasources/datasource/driver'
  </datasource>
  <datasource>         'datasources/datasource'
    <name/>            'datasources/datasource/name'
    <driver/>          'datasources/datasource/driver'
  </datasource>
</datasources>
Each time a pattern is matched, an associated rule is fired. In the
above example, a rule associated with 'datasources/datasource'
executes twice, once for each <datasource> element.
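One way to picture how such patterns arise is to keep a list of the currently open elements while the document is read and join their names with slashes. The sketch below does this with the JDK's SAX parser and counts how often each pattern occurs. It is an illustration only, not Digester's own matching code, and the class name PatternTrace is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: derives a slash-separated pattern for every
// element and counts how often each pattern occurs in the document.
class PatternTrace {

    static Map<String, Integer> countMatches(String xml) throws Exception {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            // Names of the elements that are currently open, outermost first.
            private final Deque<String> path = new ArrayDeque<>();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                path.addLast(qName);
                String pattern = String.join("/", path);
                // This is the point where a rule bound to the pattern would begin firing.
                counts.merge(pattern, 1, Integer::sum);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                path.removeLast();
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<datasources>"
                + "<datasource><name/><driver/></datasource>"
                + "<datasource><name/><driver/></datasource>"
                + "</datasources>";
        countMatches(xml).forEach((pattern, n) ->
                System.out.println("'" + pattern + "' matched " + n + " time(s)"));
    }
}
```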
Processing rules define what happens when Digester matches a pattern.
Digester includes predefined processing rules. Custom rules can also be
created by subclassing org.apache.commons.digester.Rule.
The object stack makes objects available for manipulation by processing
rules. Objects can be pushed onto or popped off the stack either
manually or through processing rules.
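The stack behavior can be modeled in a few lines of plain Java. The following is a simplified stand-in built on java.util.ArrayDeque, not Digester's implementation; the class name ObjectStackModel is hypothetical. It shows one object pushed manually, a second object pushed as a rule would push it, and the first object becoming the top again once the second is popped.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified stand-in for an object stack; not Digester's implementation.
class ObjectStackModel {
    private final Deque<Object> stack = new ArrayDeque<>();

    void push(Object o) { stack.push(o); }        // the new object becomes the top
    Object peek()       { return stack.peek(); }  // returns the top object, leaving it in place
    Object pop()        { return stack.pop(); }   // returns the top object and removes it
    int size()          { return stack.size(); }

    public static void main(String[] args) {
        ObjectStackModel stack = new ObjectStackModel();
        stack.push("root object pushed manually");
        stack.push(new StringBuilder("object created by a rule"));

        // Rules act on whatever object is currently on top.
        ((StringBuilder) stack.peek()).append(", then modified by another rule");

        System.out.println(stack.pop());   // the created object leaves the stack
        System.out.println(stack.peek());  // the manually pushed object is on top again
    }
}
```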
Using Digester
Digester is often used to parse XML configuration files. In the
following examples, we have an XML configuration file that contains
information used to build a pool of DataSource objects. DataSource is a
hypothetical class that has an empty constructor and many get and set
methods that take and return strings.
<?xml version="1.0"?>
<datasources>
  <datasource>
    <name>HsqlDataSource</name>
    <driver>org.hsqldb.jdbcDriver</driver>
    <url>jdbc:hsqldb:hsql://localhost</url>
    <username>sa</username>
    <password></password>
  </datasource>
  <datasource>
    <name>OracleDataSource</name>
    <driver>oracle.jdbc.driver.OracleDriver</driver>
    <url>jdbc:oracle:thin:@localhost:1521:orcl</url>
    <username>scott</username>
    <password>tiger</password>
  </datasource>
</datasources>
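For reference, the hypothetical DataSource class described above might look something like the following sketch. Only setName() and setDriver() are named in this article; the remaining property names are assumptions derived from the XML element names. The class is shown package-private to keep the listing self-contained; for use with Digester it would more likely be declared public in its own source file.

```java
// Hypothetical bean: an empty constructor plus String getters and setters.
// Only setName and setDriver appear in the article; the other property
// names are assumed from the element names in the sample XML file.
class DataSource {
    private String name;
    private String driver;
    private String url;
    private String username;
    private String password;

    public DataSource() { }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getDriver() { return driver; }
    public void setDriver(String driver) { this.driver = driver; }

    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }

    public String getPassword() { return password; }
    public void setPassword(String password) { this.password = password; }

    @Override
    public String toString() {
        // The password is deliberately left out of the string form.
        return "DataSource[" + name + ", " + driver + ", " + url + "]";
    }

    public static void main(String[] args) {
        DataSource ds = new DataSource();
        ds.setName("HsqlDataSource");
        ds.setDriver("org.hsqldb.jdbcDriver");
        ds.setUrl("jdbc:hsqldb:hsql://localhost");
        System.out.println(ds);
    }
}
```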
To use Digester you must create an instance of the Digester
class, push any required objects to the Digester's object stack, add a
set of processing rules, and finally parse the file. Here is an
example:
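Digester digester = new Digester();
// Create a DataSource for each <datasource> element and push it onto the stack.
digester.addObjectCreate("datasources/datasource", "DataSource");
// Pass each element's body text to a setter on the object at the top of the stack.
digester.addCallMethod("datasources/datasource/name", "setName", 0);
digester.addCallMethod("datasources/datasource/driver", "setDriver", 0);
digester.parse("datasource.xml");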
In this example, the addObjectCreate() method adds an ObjectCreateRule
to the 'datasources/datasource' pattern. The ObjectCreateRule creates a
new instance of the DataSource class and pushes the instance onto the
Digester's object stack. Next, addCallMethod() adds a CallMethodRule to
two patterns. The CallMethodRule calls the specified method of the
object at the top of the object stack. The last addCallMethod()
argument is the number of additional parameters to be passed into the
method. Since the number is zero, the matching element's body is passed
to the method as its single argument.
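Conceptually, calling a named method on the top object with the element's body comes down to a reflective method lookup and invocation. The sketch below shows that idea with plain java.lang.reflect. It is a simplification under the assumption of a single String parameter, not the CallMethodRule source, and the names CallMethodSketch, Target, and callWithBody() are hypothetical.

```java
import java.lang.reflect.Method;

// Hypothetical sketch of "call the named method with the element body".
class CallMethodSketch {

    // Stand-in for the object that sits on top of the object stack.
    public static class Target {
        private String name;
        public void setName(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Looks up a public method that takes one String and invokes it with the body text.
    static void callWithBody(Object top, String methodName, String bodyText) throws Exception {
        Method method = top.getClass().getMethod(methodName, String.class);
        method.invoke(top, bodyText);
    }

    public static void main(String[] args) throws Exception {
        Target top = new Target();
        // Roughly what happens when <name>HsqlDataSource</name> has been read.
        callWithBody(top, "setName", "HsqlDataSource");
        System.out.println(top.getName());
    }
}
```

Because the method is found by name at run time, a misspelled method name in a rule only shows up when the document is parsed, not when the code is compiled.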
If this code runs against our sample XML file, here's what happens:
- A new instance of the DataSource class is created and pushed onto the object stack
- The setName(String name) method of the newly instantiated DataSource object is called with the argument 'HsqlDataSource'
- The setDriver(String driver) method of the newly instantiated DataSource object is called with the argument 'org.hsqldb.jdbcDriver'
- At the end of the 'datasource' element, the object is popped off the stack, and the process repeats for the second 'datasource' element
The problem with this example is that the ObjectCreateRule pops the
object it creates off the stack when its associated element completes.
When Digester finishes parsing the document, only the last object
created remains. You can solve this problem by pushing an object onto
the stack before parsing begins and then calling that object's methods
to create any objects you need. The following class provides an example
of this:
import java.io.IOException;

import org.apache.commons.digester.Digester;
import org.xml.sax.SAXException;

public class SampleDigester
{
    public void run() throws IOException, SAXException
    {
        Digester digester = new Digester();

        // Push this SampleDigester instance onto the Digester's object
        // stack, making its methods available to processing rules.
        digester.push(this);

        // This set of rules calls the addDataSource method and passes
        // in five parameters to the method.
        digester.addCallMethod("datasources/datasource", "addDataSource", 5);
        digester.addCallParam("datasources/datasource/name", 0);
        digester.addCallParam("datasources/datasource/driver", 1);
        digester.addCallParam("datasources/datasource/url", 2);
        digester.addCallParam("datasources/datasource/username", 3);
        digester.addCallParam("datasources/datasource/password", 4);

        // This method starts the parsing of the document.
        digester.parse("datasource.xml");
    }

    // Example method called by Digester.
    public void addDataSource(String name,
                              String driver,
                              String url,
                              String userName,
                              String password)
    {
        // create DataSource and add to collection...
    }
}
In the SampleDigester class, the addDataSource() method is called each
time the pattern 'datasources/datasource' is matched. The
addCallParam() methods add CallParamRules that pass the matching
elements' bodies as addDataSource() method parameters. In the
addDataSource() method, you create the actual DataSource and add it to
your collection of DataSources.
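As one possible way to fill in the body of addDataSource(), the sketch below keeps the created objects in a map keyed by name. The DataSourceRegistry class, its nested stand-in for DataSource, and the get() and size() accessors are all assumptions made for illustration; the Digester wiring is left out so that the listing depends only on the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical receiver for the addDataSource() calls.
class DataSourceRegistry {

    // Minimal stand-in for the article's DataSource class.
    static class DataSource {
        private String name, driver, url, username, password;
        void setName(String v)     { name = v; }
        void setDriver(String v)   { driver = v; }
        void setUrl(String v)      { url = v; }
        void setUsername(String v) { username = v; }
        void setPassword(String v) { password = v; }
        String getName()   { return name; }
        String getDriver() { return driver; }
        String getUrl()    { return url; }
    }

    // Insertion-ordered, so entries come back in document order.
    private final Map<String, DataSource> dataSources = new LinkedHashMap<>();

    // Same signature as in SampleDigester: five String parameters.
    public void addDataSource(String name, String driver, String url,
                              String userName, String password) {
        DataSource ds = new DataSource();
        ds.setName(name);
        ds.setDriver(driver);
        ds.setUrl(url);
        ds.setUsername(userName);
        ds.setPassword(password);
        dataSources.put(name, ds);   // a later entry with the same name replaces the earlier one
    }

    public DataSource get(String name) { return dataSources.get(name); }
    public int size() { return dataSources.size(); }

    public static void main(String[] args) {
        DataSourceRegistry registry = new DataSourceRegistry();
        // Simulates the two calls that parsing the sample file would trigger.
        registry.addDataSource("HsqlDataSource", "org.hsqldb.jdbcDriver",
                "jdbc:hsqldb:hsql://localhost", "sa", "");
        registry.addDataSource("OracleDataSource", "oracle.jdbc.driver.OracleDriver",
                "jdbc:oracle:thin:@localhost:1521:orcl", "scott", "tiger");
        System.out.println(registry.size() + " data sources registered");
    }
}
```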
Digest the Digester
Although Digester was initially developed to simplify XML-configuration
file parsing, it is useful any time you need to map XML files to Java
objects. This article has provided an introduction to Digester. To
learn more about Digester and other Jakarta Commons components, visit
the Jakarta Commons Website. In addition, look at the open source
projects in the Resources section below for real-world examples of
Digester in action. You can also download the source code that
accompanies this article below.