Overview
As defined by the members of the XML:DB mailing list, a native XML database is one that:
-
Defines a (logical) model for an XML document -- as opposed to the data in that
document -- and stores and retrieves documents according to that model.
At a minimum, the model must include elements, attributes, PCDATA, and
document order. Examples of such models are the XPath data model, the
XML Infoset, and the models implied by the DOM and the events in SAX
1.0.
-
Has an XML document as its fundamental
unit of (logical) storage, just as a relational database has a row in a
table as its fundamental unit of (logical) storage.
-
Is not required to have any particular underlying physical storage model.
For example, it can be built on a relational, hierarchical, or
object-oriented database, or use a proprietary storage format such as
indexed, compressed files.
Native XML databases fall into two broad categories:
-
Document-based storage: Store the entire document in text
or binary form and provide some sort of database functionality in
accessing the document. A simple strategy for this might store the
document as a BLOB in a relational database or as a file in a file
system and provide XML-aware indexes over the document. A more
sophisticated strategy might store the document in a custom, optimized
data store with indexes, transaction support, and so on.
-
Node-based storage: Store individual nodes of the document
(such as the DOM or a variant thereof) in an existing or custom data
store. For example, this might map the DOM to relational tables such as
Elements, Attributes, and Entities, or store the DOM in pre-parsed form in a
data store written specifically for this task. This includes the
category formerly known as "Persistent DOM Implementations".
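The two strategies can be sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. This is only an illustration: the table names (documents, elements, attributes), the column layout, and the helper functions are assumptions made for this example, not the schema of any product listed below. Text that follows a child element (mixed content) is ignored for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET

SAMPLE = "<order id='42'><item sku=\"A1\">Widget</item><item sku=\"B2\">Gadget</item></order>"

def open_store():
    """Create an in-memory database with tables for both strategies (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, body BLOB);
        CREATE TABLE elements (node_id INTEGER PRIMARY KEY, doc_id INTEGER,
                               parent_id INTEGER, position INTEGER,
                               name TEXT, text TEXT);
        CREATE TABLE attributes (node_id INTEGER, name TEXT, value TEXT);
    """)
    return db

def store_document(db, doc_id, xml_text):
    """Document-based storage: keep the serialized document byte for byte."""
    db.execute("INSERT INTO documents VALUES (?, ?)",
               (doc_id, xml_text.encode("utf-8")))

def store_nodes(db, doc_id, xml_text):
    """Node-based storage: one row per element and one row per attribute."""
    def walk(elem, parent_id, position):
        cur = db.execute(
            "INSERT INTO elements (doc_id, parent_id, position, name, text) "
            "VALUES (?, ?, ?, ?, ?)",
            (doc_id, parent_id, position, elem.tag, elem.text))
        node_id = cur.lastrowid
        for name, value in elem.attrib.items():
            db.execute("INSERT INTO attributes VALUES (?, ?, ?)",
                       (node_id, name, value))
        # The position column preserves document order among siblings.
        for i, child in enumerate(elem):
            walk(child, node_id, i)
    walk(ET.fromstring(xml_text), None, 0)

db = open_store()
store_document(db, 1, SAMPLE)
store_nodes(db, 1, SAMPLE)
```

In the first case the database sees one opaque value and any XML awareness has to come from separate indexes; in the second, ordinary SQL can reach individual elements and attributes, but the original text has to be rebuilt from rows.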
There are two major differences between the two strategies.
First, document-based storage can exactly round-trip the document, down
to such trivialities as whether single or double quotes surround
attribute values. Node-based storage can only round-trip documents at
the level of the underlying document model. This should be adequate for
most applications but applications with special needs in this area
should check to see exactly what the database supports.
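The round-trip difference is easy to observe with any parser that builds a tree. In the following sketch, Python's xml.etree.ElementTree stands in for a node-based store (an assumption for illustration only; no product listed here is implied to use it).

```python
import xml.etree.ElementTree as ET

original = "<note lang='en'>hello</note>"

# Document-based storage keeps the text itself, so retrieval is exact.
stored_text = original

# Node-based storage keeps only the model (here, an ElementTree standing in
# for the stored nodes), so the document is re-serialized on retrieval.
stored_tree = ET.fromstring(original)
reserialized = ET.tostring(stored_tree, encoding="unicode")

print(stored_text)    # single quotes preserved
print(reserialized)   # serializer chooses its own quoting
```

The re-serialized document carries the same elements, attributes, and text, but the single quotes around the attribute value come back as double quotes.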
The second major difference is speed. Document-based storage obviously has
the advantage in returning entire documents or fragments. Node-based
storage probably has the advantage in combining fragments from
different documents, although this does depend on factors such as
document size, parsing speed (for document-based storage), and
retrieval speed (for node-based storage). Whether it is faster to
return an entire document as a DOM tree or SAX events probably depends
on the individual database, again with parsing speed competing against
retrieval speed.
Native XML databases differ from XML-enabled databases in three main ways:
-
Native XML databases can preserve physical structure (entity
usage, CDATA sections, etc.) as well as comments, PIs, DTDs, etc. While
XML-enabled databases can do this in theory, this is generally not
(never?) done in practice.
-
Native XML databases can store XML documents without knowing
their schema (DTD), assuming one even exists. Although XML-enabled
databases could generate schemas on the fly, this is impractical,
especially when dealing with schema-less documents.
-
The only interface to the data in native XML databases is XML
and related technologies, such as XPath, the DOM, or an XML-specific
API, such as the XML:DB API. XML-enabled databases, on the other hand, offer direct access to the data, such as through ODBC.
For more information about native XML databases, see "Native XML Databases".
Related categories
- Content (Document) Management Systems:
Applications built on top of native XML databases and/or the file
system for content/document management. Include features such as
check-in/check-out, versioning, and editors.
- XML-Enabled Databases:
Databases with extensions for transferring data between XML documents
and themselves. Some of these also support native XML storage.
Products
4Suite, 4Suite Server
Developer: FourThought
URL: http://4suite.org/index.xhtml
License: Open Source
Database type: Object-oriented
Entry last updated: November, 2001
From the Web site:
"4Suite is a collection of Python tools for XML processing and object
database management. It is an integrated package of several components:
4DOM, DbDom, 4XPath, 4XSLT, 4XPointer, 4XLink, 4RDF and 4ODS. All the
tools are designed for maximum extensibility through custom Python
code."
"DbDOM is a DOM implementation that is stored persistently in
a 4ODS object database in order to support arbitrarily large documents
and applications with specialized persistent needs."
"4XPath is a library implementing the W3C's full XPath 1.0
specification for indicating and selecting portions of an XML
document."
"4Suite Server is a platform for XML processing. It is an XML
data repository with a rules-based engine. It supports DOM access, XSLT
transformation, XPath and RDF-based indexing and query, XLink
resolution and many other XML services. It also provides support
services such as distributed transactions, and access control lists. It
supports remote, cross-platform and cross-language access through
CORBA, SOAP and HTTP GET."
Berkeley DB XML
Developer: Oracle (formerly owned by Sleepycat Software)
URL: http://www.oracle.com/database/berkeley-db/xml/index.html
License: Open Source
Database type: Key-value
Entry last updated: December, 2005
Berkeley DB XML is a native XML database built on top of Berkeley
DB, adding an XML parser, XML indexes, and an XQuery engine. From
Berkeley DB it inherits a storage engine, transaction support
(including XA), automatic recovery, and other features.
Berkeley DB XML stores XML documents in logical groupings
called containers, which are the same as collections in other native
XML databases. Users can specify a number of properties on a
per-container basis, including whether to validate documents, whether
to store documents whole or as individual nodes, and what indexes to
create (element, attribute, or metadata). It is worth noting that
schemas are specified through schemaLocation hints in documents, rather
than being associated with the container as a whole.
In addition to storing XML documents, Berkeley DB XML can
store non-XML documents (in the underlying Berkeley DB data store) as
well as metadata for XML documents. The latter take the form of
user-specified property-value pairs and can be queried as if they were
child elements of the root element, although they do not actually
appear in stored XML documents.
Berkeley DB XML supports XQuery as its query language. It
provides an API for updating documents that uses XQuery to identify a
set of nodes to update and allows users to append a new child to a
target node, insert a new node before or after a target node, remove a
target node, rename a target node, or change the value of a target
node. Updates are performed at the node level.
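The update operations listed above (select a set of target nodes, then append, insert, remove, rename, or change a value) can be sketched in a few lines. This is not the Berkeley DB XML API: the function names are hypothetical, and the limited XPath subset of Python's xml.etree.ElementTree stands in for XQuery as the way of selecting targets.

```python
import xml.etree.ElementTree as ET

def parent_map(root):
    """ElementTree has no parent pointers, so build a child -> parent lookup."""
    return {child: parent for parent in root.iter() for child in parent}

def append_child(root, path, tag, text=None):
    """Append a new child element to every node selected by path."""
    for target in root.findall(path):
        ET.SubElement(target, tag).text = text

def insert_before(root, path, tag, text=None):
    """Insert a new sibling element in front of every selected node."""
    targets = root.findall(path)
    parents = parent_map(root)
    for target in targets:
        parent = parents[target]
        new = ET.Element(tag)
        new.text = text
        parent.insert(list(parent).index(target), new)

def remove(root, path):
    """Remove every selected node from its parent."""
    targets = root.findall(path)
    parents = parent_map(root)
    for target in targets:
        parents[target].remove(target)

def rename(root, path, new_tag):
    """Change the element name of every selected node."""
    for target in root.findall(path):
        target.tag = new_tag

def set_value(root, path, text):
    """Replace the text content of every selected node."""
    for target in root.findall(path):
        target.text = text

doc = ET.fromstring("<book><title>Old</title><draft/><author>Ann</author></book>")
set_value(doc, "title", "New")
rename(doc, "author", "creator")
remove(doc, "draft")
append_child(doc, ".", "year", "2005")
insert_before(doc, "creator", "subtitle", "A sketch")
result = ET.tostring(doc, encoding="unicode")
```

Each call touches only the selected nodes, which is the sense in which updates are performed at the node level rather than by replacing whole documents.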
Like Berkeley DB, Berkeley DB XML is a library that is linked
directly to applications, rather than being used in client-server mode.
It has a command-line interface as well as APIs for C++, Java, Tcl,
Perl, Python, and PHP. Third-party APIs for other languages are
available as well.
Birdstep RDM XML
Developer: Birdstep
URL: http://www.birdstep.com/database_technology/rdm_xml.php3
License: Commercial
Database type: Object-oriented
Entry last updated: October, 2002
Birdstep RDM XML is a native XML database built on top of the
Birdstep RDM Mobile database. Documents can be stored in one of two
ways. DTD-less documents are stored using a set of predefined classes
that roughly correspond to the information items defined in the XML
Information Set. Documents with DTDs can be stored with these classes
or with classes generated from the DTD. The latter classes provide an "XML data binding"
solution; that is, binding of XML documents to DTD-specific classes. In
addition, they inherit from the generic classes, so they can also be
viewed generically.
Birdstep RDM XML implements XPath as its query language. XPath
queries can be compiled and stored in the database for faster
re-execution. It also provides implementations of DOM and SAX. The DOM
interface is available in both Java and C++ and is live. That is,
changes made through the DOM are immediately visible to other users.
The SAX implementation uses Expat to parse incoming documents; the
database itself can act as a SAX Reader for outgoing documents.
The Birdstep RDM Mobile database on which Birdstep RDM XML is
based is an "ultra small" database designed for use on handheld devices
and in embedded systems. It stores data as "molecules", each of which
consists of two or three "atoms": a content atom (which contains the
actual data value when it is longer than 4 bytes), an instance atom
(which contains data values of 4 bytes or less, or pointers to data
values longer than 4 bytes), and a type atom (which defines the type of
an instance atom). Because content atoms are reused, no data longer
than 4 bytes is stored more than once. In addition, content atoms
contain pointers to all instances, so they serve as indexes. By
constructing chains of instances and links between chains, Birdstep RDM
Mobile can store virtually any kind of data structure, including XML.
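The sharing of content atoms can be illustrated with a toy model. Everything in the following sketch (the AtomStore class, its fields, and its methods) is a hypothetical simplification written for this description; it is not Birdstep's storage format or API. It only shows the two properties described above: values longer than 4 bytes are stored once, and the shared value doubles as an index to every instance that uses it.

```python
class AtomStore:
    """Toy model: long values are stored once and shared; short values are inline."""

    INLINE_LIMIT = 4  # bytes, following the description above

    def __init__(self):
        # "Content atoms": value bytes -> ids of all instances using that value.
        self.content = {}
        # "Instance atoms": (type name, inline bytes or None, content key or None).
        self.instances = []

    def add(self, type_name, value):
        data = value.encode("utf-8")
        instance_id = len(self.instances)
        if len(data) <= self.INLINE_LIMIT:
            self.instances.append((type_name, data, None))
        else:
            # Reuse the existing content entry if the value was seen before.
            self.content.setdefault(data, []).append(instance_id)
            self.instances.append((type_name, None, data))
        return instance_id

    def value(self, instance_id):
        _, inline, key = self.instances[instance_id]
        return (inline if key is None else key).decode("utf-8")

    def find(self, value):
        """Use the back-pointers as an index (long values only)."""
        return list(self.content.get(value.encode("utf-8"), []))

store = AtomStore()
a = store.add("city", "Stockholm")
b = store.add("birthplace", "Stockholm")
c = store.add("code", "SE")
```

Here "Stockholm" occupies one content entry even though two instances refer to it, and looking it up returns both instance ids without scanning the instance list.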
Birdstep RDM Mobile supports object inheritance,
encapsulation, and polymorphism; indexes; queries based on object type,
value, or hierarchical relation; and transactions.
Centor Interaction Server
Developer: Centor Software Corp.
URL: http://www.centor.com/solutions/technology.shtml
License: Commercial
Database type: Proprietary
Entry last updated: September, 2001
Centor Interaction Server consists of the following parts. (Description quoted from the Web page.)
"Interaction Store - ... Data resides in the Interaction Store
in a structured and unstructured layout, using XML documents with a
fast and scalable indexing technology. ..."
"Data Engines - The Centor Interaction Server has four engines used to manage and manipulate data in the Interaction Store. ..."
"o The Query Engine gives users the ability to categorize,
organize and search data ... including capabilities for popular
decision support features such as "drill-down", query by range of
values, compare/contrast, and content export to other applications via
XML data interchange. ..."
"o The Index Engine indexes all data stored in the Interaction
Store. It supports advanced searching capabilities that provide
intelligent search criteria via either metadata, attribute and value,
or text keywords ..."
"o The Update Engine is a command-driven application that is
used to create or modify XML entries using the Centor Data Exchange
Language (CDXL). CDXL is a neutral file format used to exchange data
between the Interaction Server and external sources."
"o The Data Processing Engine provides complex decision
support functionality such as rules-based query, traffic lighting, and
complicated mathematical calculations."
"Security Engine - The Interaction Server provides a complete
security policy management infrastructure for Web-based enterprise
applications. It manages end user information including user name,
password, departments and the role of each user. ..."
"Data Access (API) - The Data Access layer provides two-way
content access and integration capabilities. The Interaction Server
offers a number of Application Programming Interfaces (APIs), including
URL, C++, and EJB programming interfaces. This layer also provides ODBC
and JDBC interfaces used to access data stored in relational
databases."
Centor Interaction Server also includes a workflow engine, styling engine, and presentation manager.
DBDOM
Developer: K. Ari Krupnikov
URL: http://dbdom.sourceforge.net/
License: Open Source
Database type: Relational
Entry last updated: November, 2000
DBDOM is an implementation of the DOM over a relational database,
using a fixed set of tables to store the DOM tree in the database. DOM
methods are implemented as stored procedures, but also included are a
set of adapters so these can be called from Java. The initial version
will run on PostgreSQL, with later versions planned for Oracle, DB2,
and Microsoft SQL Server.
dbXML
Developer: dbXML Group
URL: http://www.dbxml.com/product.html
License: Open Source
Database type: Proprietary
Entry last updated: March, 2004
dbXML is a native XML database that supports four different data
stores. The first of these is a proprietary data store that uses
B-trees. The second is an in-memory data store, which is used for
temporary storage and whose contents are deleted when the database is
stopped. The third is the file system. And the fourth is a mapping to a
relational database (it is not known what mapping is used). Which data
store to use is specified on a per-collection basis.
dbXML has a directory-like collection model. Collections can
be nested and can store documents that match any XML schema, although
it is suggested that a single collection contain documents that match a
single XML schema to simplify indexing and querying. Collections can
also contain binary streams (such as JPEG files), although a collection
cannot contain both binary streams and XML documents.
dbXML supports XPath, XSLT, XUpdate,
and full-text searches. XPath and XSLT have been extended for use
against collections, and both XSLT and full-text searches can be run
against the results of an XPath query.
dbXML supports three different types of indexes. Name indexes
index element and attribute names. Value indexes index element and
attribute values and support strings, characters, bytes, integers, real
numbers, and booleans. Full text indexes index tokens in element and
attribute values. They are case insensitive and actually index word
stems; for example, both "happening" and "happen" have the same stem.
Individual indexes are associated with a particular collection and
users specify what to index according to an XPath-like expression.
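The three kinds of index can be sketched as three lookup tables. The code below is an illustration only: it indexes every element and attribute rather than those selected by an XPath-like pattern, its crude_stem function is a deliberately rough stand-in for a real stemmer, and none of the names come from dbXML.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

def crude_stem(word):
    """Very rough suffix stripping; a real stemmer is far more careful."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_indexes(docs):
    """docs maps a document id to XML text; returns (names, values, words)."""
    name_index = defaultdict(set)   # element or attribute name -> doc ids
    value_index = defaultdict(set)  # (name, exact value) -> doc ids
    text_index = defaultdict(set)   # stemmed, lower-cased token -> doc ids
    for doc_id, xml_text in docs.items():
        for elem in ET.fromstring(xml_text).iter():
            pairs = [(elem.tag, (elem.text or "").strip())]
            pairs += [("@" + k, v) for k, v in elem.attrib.items()]
            for name, value in pairs:
                name_index[name].add(doc_id)
                if value:
                    value_index[(name, value)].add(doc_id)
                for token in re.findall(r"[A-Za-z]+", value):
                    text_index[crude_stem(token)].add(doc_id)
    return name_index, value_index, text_index

docs = {
    "d1": "<event status='open'><title>What is happening</title></event>",
    "d2": "<event status='closed'><title>It did happen</title></event>",
}
names, values, words = build_indexes(docs)
```

Because "happening" and "happen" reduce to the same stem, a full-text lookup for either word finds both documents, while the value index still distinguishes the exact attribute values "open" and "closed".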
dbXML supports triggers. These are user-specified Java classes
that can be fired before or after an insert, update, delete, or data
retrieval. They can be used to do such things as validating documents
on insertion or modifying documents on retrieval. dbXML also supports
extensions to the server through Java classes.
dbXML supports transactions and security. Security options are
no security, a single user name and password for the entire database,
and role-based security (the default).
dbXML has four different APIs: the direct API, the client API, XML:DB,
and Web services. The direct API allows applications to work directly
with dbXML. The client API allows applications to use dbXML in
client-server fashion. This can be done where both client and server
are in the same process, or through XML-RPC. The Web services interface
supports both XML-RPC and REST (URL encoding).
dbXML comes with a set of command line tools for connecting to
the database, managing collections, indexes, security, triggers, and
extensions, and storing and retrieving documents.
NOTE: dbXML is a complete rewrite of the code that became Xindice and is therefore different from that product.
Dieselpoint
Developer: Dieselpoint, Inc.
URL: http://www.dieselpoint.com/xmlsearch.html
License: Commercial
Database type: None (indexes only)
Entry last updated: January, 2007
Dieselpoint is a search engine, not a native XML database. It
indexes documents and data specified by the user and then executes
queries against those indexes. Dieselpoint is written in Java and will run
in any J2EE-compliant application server. It is designed to be called
from a user-written application and its API is designed with such
applications in mind. For example, it returns metadata about search
results so applications can dynamically create user interfaces relevant
to those results. Applications can call Dieselpoint through a Java API,
a JSP front end, JDBC, or XML. For users who do not want to write their
own application, Dieselpoint ships with a number of sample applications
(including a product catalog application) and a generic, JSP-based user
interface that is "suitable for common uses".
Dieselpoint indexes documents and data retrieved by a crawler
from Web sites, directories, and databases. It can index documents
(XML, HTML, PDF, Microsoft Office), databases (via JDBC), and flat
files (comma-separated, tab-separated, and so on). Data in other
formats can be indexed via calls to a user-implemented API. The indexer
extracts data in the form of attributes, such as document metadata, XML
elements and attributes, and database columns. A preprocessor allows
user-written code to modify, categorize, or reject items before they
are indexed.
Dieselpoint uses a proprietary query language, which supports
full-text and parametric searching. (Parametric searching limits a
search to a particular attribute, such as a title, part number, or
description.) Search clauses can be joined in any way by AND, OR, NOT,
and parentheses, and can include comparisons (=, >, >=, <,
<=, <>), wildcards, and regular expressions. Full-text
features include stemming, thesauruses, stop words, misspellings,
relevance, hit highlighting, and support for 40 languages and 140
dialects. Search results can be returned as a JDBC result set or XML
document and can be sorted by relevance or attribute value.
XML-specific features include searching by element or
attribute and by XML path. (The indexer preserves the XML hierarchy.)
The query engine can return complete documents or fragments, and can
also treat fragments of a document (headed by a particular element
name) as separate documents. Dieselpoint understands both ECCMA (an XML language for catalogs) and Dublin Core and provides special processing for both. In addition, it can handle XMP metadata (RDF documents) embedded in PDF documents.
Dieselpoint includes an administrator for performing such tasks as
managing indexes, defining data sources, and scheduling the crawler. It
also contains a Web server and servlet container.
DOMSafeXML
Developer: Ellipsis
URL: http://www.ellipsis.nl/content/products.htm
License: Commercial
Database type: File system(?)
Entry last updated: June, 2004
DOMSafeXML is a main-memory native XML database that stores XML
files on disk and monitors them "for external changes". It supports
XPath, SAX, DOM level 2, and the XML:DB API, with language bindings for
COM, C++, Java, and C#. DOMSafeXML supports multi-user access through
transactions and node-level locking and comes with a built-in Web
server.
eXist
Developer: Wolfgang Meier
URL: http://exist.sourceforge.net
License: Open Source
Database type: Proprietary
Entry last updated: March, 2004
eXist is a native XML database that uses a proprietary data store (B+
trees and paged files). It can be run as a standalone database server,
as an embedded database, or in the servlet engine of a Web application.
Documents are stored in a hierarchy of collections. Collections can
contain child collections and do not constrain documents to any
particular schema or document type.
eXist supports XQuery/XPath 2.0 and XQuery statements can
query any combination of collections and documents. eXist does not
support strong data typing but does provide a number of extensions to
XQuery. In particular, eXist's implementation of XQuery can execute
full text searches, call the XML:DB API (such as to store query results
in the database), execute dynamically constructed XQuery statements,
apply XSLT stylesheets to a node, work with HTTP, and execute arbitrary
Java methods. eXist also provides partial support for XInclude and
XPointer.
Updates are primarily supported through XUpdate. When eXist is being used as an embedded database, live DOM trees are supported as well. eXist supports the XML:DB API,
with additional services for preparing and executing XQuery statements,
managing users, managing multiple database instances, and querying
indexes. DOM and SAX are supported for documents returned through the
XML:DB API. eXist can also be called via XML-RPC, a REST-style Web
services API, SOAP, and WebDAV.
eXist automatically indexes all element and attribute
structure. By default, it creates full text indexes over all text and
attribute values, but users can turn this off for selected parts of a
document. It supports concurrent read/write access for multiple users,
as well as access control at both the collection and document level. It
does not currently support transactions.
Of note, eXist has complete documentation.
eXtc
Developer: M/Gateway Developments Ltd.
URL: http://www.mgateway.tzo.com/php/mgw/extc.php
License: Commercial
Database type: Post-relational (Cache)
Entry last updated: March, 2004
eXtc is a native XML database built on top of the Cache
database. It provides an implementation of the DOM over Cache, storing
documents as DOM objects. Because eXtc is written in Cache ObjectScript
(Cache's extension of the MUMPS programming language), the DOM
implementation inherits the features of that language, such as integral
support for transactions, multi-user access, and remote access. DOM
level 2 is supported, with additional support for the Abstract Schemas
and Load and Save features of DOM level 3.
eXtc supports XPath queries over individual documents, as well
as SQL queries against the Cache tables used to store DOM trees. The
latter is useful for SQL-like queries -- for example, finding the value
of all CustomerName elements -- as well as finding the IDs of documents
that match certain criteria.
eXtc also supports XSL-FO, SVG (through a library of functions
for creating SVG documents), WebDAV, and HTTP access through Cache's
WebLink module. In addition, it includes client and server
implementations of SOAP and WSDL, which allow Cache applications to be
exposed as Web services and to be integrated with Web services.
Extraway
Developer: 3D Informatica
URL: http://www.3di.it/h3/h3/aSito_3DI/finizio (Italian)
http://www.3di.it/nuovo/html/extraway.pdf (English, PDF)
License: Commercial
Database type: Files plus indexes
Entry last updated: August, 2005
From the company:
"Extraway is a native XML database that is designed to preserve
data as "Information Units", which are objects defined by the database
administrator and which use an XML data model. By default, information
units correspond to the root element of a document, but can also
correspond to lower-level elements. For example, a single document may
contain multiple bibliographic records where each record is considered
to be a single information unit."
"Extraway supports synchronous and asynchronous use cases. In
the synchronous case, client applications create and retrieve
information units. The engine receives XML information units and stores
them on a private area of the file system. The storage process is
configured by the database administrator, who determines the
aggregation policy, the directory, and the file name settings. For
example, the administrator can organize the file system by year,
department, and classification and arrange information units of a given
type in the same XML document."
"During the aggregation process, Extraway adds system metadata
for each unit: time/username of submission/modification, an integrity
hash, and the current versions of the organization's structure, DTD/XML
Schema, classification plan, and client software."
"Extraway also manages multimedia objects, storing them in the
same directory as the corresponding information unit. For the most
common formats, the text is extracted and indexed, and metadata like
size, resolution, compression, duration, and hash code are extracted
and added to the information unit."
"In the asynchronous case, Extraway monitors XML files at a given network address and simply indexes them."
"Indexes are built relative to the root of each information
unit. Each index has a specific type (string, number, or date) and can
index the entire content of an element or attribute or individual
tokens within that content. Indexes can also concatenate individual
values, or be created from custom code that is run at index time.
Indexes can be built on demand, at regular intervals, or, by default,
in response to events such as adding, deleting, and modifying
information units."
"Extraway has a proprietary query language that allows users
to combine path expressions with boolean operators. Path expressions
can be declared and aliased at design time. This allows path details to
be abstracted, which is useful when merging different paths in the same
model and in handling different DTD / XML Schema versions. The language
supports equality, arithmetic, and full-text operators. Query results
are returned as a named result set, which can be browsed, refined,
referenced in other queries, or made persistent."
"Extraway can also be queried with SQL, which is used for its
joining operators: when the selected columns return non-repetitive
values, the result is a trivial table having information unit as rows;
in the other cases the result is an array of information unit
identifiers."
"Extraway includes a GUI-based DTD editor, an administration
console, Java, .Net, and Web services APIs, and a JDBC driver. Other
features include support for thesauruses and encryption of XML units
stored on the file system."
GoXML DB
Developer: XML Global
URL: http://www.xmlglobal.com/prod/db/index.jsp
License: Commercial
Database type: Proprietary (Model-based)
Entry last updated: May, 2003
GoXML DB is the same as XStreamDB. For more information, see the XStreamDB entry.
Infonyte DB (formerly PDOM)
Developer: Infonyte
URL: http://www.infonyte.com/prod_db.html
License: Commercial
Database type: Proprietary (Model-based)
Entry last updated: February, 2002
Infonyte is a native XML database built from two components:
Infonyte PDOM (Persistent DOM) and Infonyte XQL (which can be purchased
separately). Infonyte PDOM is a storage engine for storing the XML
documents in indexed, binary files. The PDOM engine provides an
implementation of the DOM over these files. The DOM implementation can
handle arbitrarily large documents because it swaps DOM nodes to disk
as needed. It includes defragmentation and garbage collection
facilities, commit points (for writing the in-memory tree to disk),
file compression with gzip, and thread-safe operation.
Infonyte XQL is an implementation of XQL with extensions for
variables, multi-document queries, restructuring of query results,
full-text search, result construction, and sequencing. It is
addressable through HTTP.
Version 2.0 (feature complete and in beta as of December,
2001) includes support for XPath, DOM Level 2, and XSLT. XSLT support
is provided by Xalan, which has been extended to work directly on data
in the database.
Ipedo
Developer: Ipedo
URL: http://www.ipedo.com/html/products.html
License: Commercial
Database type: Proprietary
Entry last updated: August, 2003
Ipedo consists of three main components: the XML Database (provides
native XML storage and XQuery engine), the Integration Manager
(integrates external data using XML), and the Web Express (creates
output documents).
The XML Database is a native XML database that uses a
proprietary data store. It supports XQuery, indexing, schema
management, a proprietary linking language, collections, and
transactions. The XQuery engine can query documents in the native data
store as well as views built over external data sources (see below). It
supports proprietary extensions for updates and full-text searches and
uses XML Schemas if they are available. The linking language is used to
build virtual documents. It uses URLs to include documents or fragments
stored on the Web, in the native data store, or in views. Links may be
parameterized. The Schema Manager supports XML Schemas and DTDs, which
are converted to XML Schemas. If a schema is associated with a
collection, documents in that collection are validated on insert and on
update. The Schema Manager also supports versioning.
The Integration Manager has three components: XML Views,
Adapters, and Content Converters. XML Views provide dynamic, read-only
access to external data. They are supported for relational databases
(using a table-based mapping?), SOAP, HTTP, and XML documents stored in
local or remote Ipedo databases. Views are accessed in XQuery
statements via the proprietary view() function and are evaluated at
query time. Adapters synchronize local copies of external data with
their external source. They consist of an inbound adapter (running in
Ipedo) and an outbound adapter (running in the external source).
Adapters are run by triggers in both Ipedo and the external source and
use a proprietary XML protocol that runs over HTTP or JMS. Adapters are
available for Oracle and DB2; users can write their own Adapters as
well. Content Converters provide static, read-only access to external
documents. They convert documents from a variety of formats (Word, PDF,
etc.) into XML documents, which are then stored in the XML Database.
The Web Express provides a User Profile Manager, a
Transformation Engine, Pipelines, and Web Tags. The User Profile
Manager allows designers to customize views of the data on a per-user
basis. The Transformation Engine is an index-aware XSLT engine.
Pipelines are a named set of transformations; these can be performed by
the Transformation Engine or components written by the user. Pipelines
can use any source of XML as input (an XML document, an XQuery
statement, another Pipeline, etc.) and send the output to a specific
destination, such as a Web site or a specific application. Pipelines
can be parameterized. Web Tags are a tag library for writing Java Server
Pages (JSPs) that use Ipedo.
Ipedo is written in Java and is accessible from Java, EJBs,
COM, SOAP, and WebDAV. It provides security through users, groups, and
access control lists; supports journaling, on-line backups, and
external clustering mechanisms; and allows read-only replicas of the
data store to be deployed. It comes with GUI-based design and
administration tools.
Lore
Developer: Stanford University
URL: http://www-db.stanford.edu/lore/home/index.html
License: Research
Database type: Semi-structured
Entry last updated: November, 2000
Semi-structured data is data with more structure than a
conversation, but less structure than a telephone book. A good example
is a resume (curriculum vitae). While virtually all resumes include a
name, address, and telephone number, only some will include an email
address, Web site, or FAX number. Most will include a list of previous
jobs, but others might include only a list of university courses.
Depending on the profession, there might be a list of software used or
licenses held.
XML is well-suited to storing semi-structured data and shares
a feature common to many semi-structured data models: it is
self-describing. That is, it carries a certain amount of metadata with
the data. In the case of XML, this is in the form of element type and
attribute names. The legality of well-formed documents mirrors another
feature found in many semi-structured data models: the data model is
not required to have a definitive schema, and the model can be extended
at will by the addition of new fields.
Lore is a database designed for storing semi-structured data.
Although it predates XML, it has recently been migrated for use as an
XML database. It includes a query language (Lorel), multiple indexing
techniques, a cost-based query optimizer, multi-user support, logging,
and recovery, as well as the ability to import external data. Because
Lore is designed for use with semi-structured data, XML documents
without DTDs can be easily stored.
An interesting feature of Lore is a DataGuide, which is a
"structural summary of all paths in the database". Unlike structured
databases, in which the structure is specified first and data is added
according to that structure, data is entered first into Lore and the
structure is then summarized. The resulting information is useful for
query processing.
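The idea of summarizing structure after the data has been loaded can be illustrated with a short Python sketch. It is a deliberate simplification, not Lore's implementation: it only collects the distinct label paths found in an XML document, and the function name path_summary and the sample resume data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def path_summary(xml_text):
    """Return the sorted list of distinct root-to-node label paths."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        paths.add(path)
        # Attributes are recorded as paths too, using an @ prefix.
        for name in elem.attrib:
            paths.add(path + "/@" + name)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return sorted(paths)


# Hypothetical sample data: two resumes with different optional fields.
RESUMES = (
    "<resumes>"
    "<resume><name>Ann</name><email>ann@example.org</email></resume>"
    "<resume><name>Bob</name><job><title>Editor</title></job></resume>"
    "</resumes>"
)

if __name__ == "__main__":
    for path in path_summary(RESUMES):
        print(path)
```

Each path appears once in the summary no matter how many documents contain it, so a query processor could consult the summary to learn which paths exist before touching the data.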
The Lore executables are "available for public use". Source code may be available in some circumstances.
MarkLogic Server (formerly Cerisent Content Interaction Server)
Developer: Mark Logic Corp.
URL: http://www.marklogic.com/products/index.html
License: Commercial
Database type: Proprietary (?)
Entry last updated: June, 2004
MarkLogic Server is a native XML database that uses a
proprietary(?) data store. It stores all content as XML, converting
documents from formats such as Microsoft Office, PDF, and StarOffice
when they are loaded. At load time, documents are indexed with both
full-text and parametric (structured?) indexes. They can be stored with
"configurable levels of document fidelity". Presumably, this means that
users can choose which types of XML structures to store. For example,
users might be able to discard comments, processing instructions,
insignificant white space, and sibling order from data-centric
documents.
MarkLogic Server supports XQuery, with extensions for
full-text queries, updates, and transaction handling. Updates can be
performed at the node level through the XQuery extensions or a
proprietary API, and can be grouped together into a single transaction.
Queries are "lock-free" and journaling is performed to allow recovery
in the event of a system crash. MarkLogic Server supports XML
Schemas as well.
Applications interact with MarkLogic Server through
the Java XDBC API or an integrated HTTP server. In addition, the server
can be customized through the Services Component Adapter Layer.
myXMLDB
Developer: Mladen Adamovic
URL: http://sourceforge.net/projects/myxmldb/
License: Open Source
Database type: MySQL
Entry last updated: January, 2005
myXMLDB is a native XML database implemented on top of MySQL. It
stores documents as BLOBs and can store documents up to 256 MB in size.
It supports XPath and XQuery through Saxon and provides a Java
implementation of the XML:DB API. A GUI interface is provided through
XMLdbGUI.
Natix
Developer: data ex machina
URL: http://www.dataexmachina.de/natix.html
http://pi3.informatik.uni-mannheim.de/~moer/natix.html (in German)
License: Commercial
Database type: Proprietary(?)
Entry last updated: June, 2004
Natix is a native XML database designed to:
"...support compact storage of structure and content of XML
documents, index structures for content and structure retrieval,
validation, recovery, isolation of multiple users that work on the same
document(s), query evaluation, a rich set of application programming
interfaces (APIs) for languages like C++, Java, and support for legacy
applications."
Although the Web site is unclear about the status of Natix, it is a released product and is the engine behind Xyleme Zone Server. It is primarily designed to be embedded in applications or other systems, but can be bought separately.
NaX Base (formerly Lucid XML Data Manager)
Developer: Naxoft (bought from Lucid'i.t.)
URL: http://www.naxoft.com/produit-presentation.html (in French)
License: Commercial
Database type: Proprietary
Entry last updated: May, 2004
NaX Base is a native XML database built on a proprietary data
store, which maintains both node-level and full-text indexes. Of
interest, indexes are optimized asynchronously, which might allow for
faster updates. Documents can be organized in collections, which can be
nested inside each other.
The query language is an extended version of XPath which
supports queries along the hierarchy of collections as well as a grep
operator for doing full-text searches. Full-text search features
include case-sensitive and -insensitive searches, wildcards, and
searches that omit specific words. Updates are done through an API with
methods such as insertBefore, insertAfter, and appendChild. These
methods accept the new node and an XPath expression identifying the
location of the change.
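The update style described above, in which a method takes a new node plus a path expression that locates the change, can be sketched in Python as follows. This is not the NaX Base API: the function names only mirror the method names mentioned above, and the paths use the limited XPath subset supported by Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def _locate(root, path):
    """Return (parent, target) for the first node matching path."""
    target = root.find(path)
    if target is None:
        raise LookupError("no node matches " + path)
    # ElementTree keeps no parent pointers, so search for the parent.
    for parent in root.iter():
        if target in list(parent):
            return parent, target
    raise LookupError("the root element has no siblings")


def insert_before(root, path, new_node):
    parent, target = _locate(root, path)
    parent.insert(list(parent).index(target), new_node)


def insert_after(root, path, new_node):
    parent, target = _locate(root, path)
    parent.insert(list(parent).index(target) + 1, new_node)


def append_child(root, path, new_node):
    target = root.find(path)
    if target is None:
        raise LookupError("no node matches " + path)
    target.append(new_node)


def demo():
    doc = ET.fromstring("<list><item>b</item></list>")
    insert_before(doc, "item", ET.fromstring("<item>a</item>"))
    insert_after(doc, "item[2]", ET.fromstring("<item>c</item>"))
    append_child(doc, ".", ET.fromstring("<item>d</item>"))
    return [item.text for item in doc]


if __name__ == "__main__":
    print(demo())
```

Only the first node matching the path is changed in this sketch; whether a real product updates one match or all matches is a design decision the description above does not settle.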
NaX Base allows users to assign access privileges on a
per-user and per-collection basis. In addition, individual documents
can be locked by a single user. NaX Base can be run either locally or
via a network, and has both Java and COM APIs. A GUI-based
administration and development tool is provided.
NeoCore XMS
Developer: Xpriori (who bought NeoCore's intellectual property)
URL: http://www.xpriori.com/index.html
License: Commercial
Database type: Proprietary
Entry last updated: Summer, 2001
From the company:
"NeoCore XML Management System (XMS) is a fully transactional
native XML database system that serves as a bi-directional web server,
accepting and returning XML documents and fragments via HTTP(S). It
supports all basic database functions, including storage, delete, copy,
and query for XML documents, and insert, modify, and query for XML data
elements. NeoCore XMS is schema-independent and requires no database or
schema design before using the system. That is, rows, columns, tables,
fields, or indexing instructions do not need to be created before
documents are added to the database. When new documents are added to
the database, their structure and data - metadata and data - are
derived and automatically indexed. Users then can change the structure
of existing documents without database system redesign. Specific
features of NeoCore XMS include XPath-based query support, access
control, user-defined document data management, GUI-based
administration, and session control."
"NeoCore XMS uses a variant of XPath as its query language;
query responses return elements, document fragments, full documents,
and multiple documents. Queries can be made without knowing the
document structure - context can be queried for data, and data can be
queried for context. Boolean and wildcard options are fully supported
and all query results are well-formed XML. For query processing,
NeoCore XMS uses Digital Pattern Processing, a patented technology that
streamlines queries by using fixed-length icons."
"NeoCore XMS has built-in access control to set permissions at
the document or fragment level, and at user or group levels. NeoCore
XMS also supports access control by specifying IP addresses and
supporting X.509 certificates via Netscape Server."
"Interfaces to NeoCore XMS include HTTP(S)/SSL, Java, C++, and
Microsoft COM. NeoCore XMS also can interface with existing databases
through the X-Aware data integration tool."
ozone
Developer: ozone-db.org
URL: http://ozone-db.org/frames/home/what.html
License: Open Source
Database type: Object-oriented
Entry last updated: March, 2004
From the Web site:
"ozone is a fully featured, object-oriented database management
system completely implemented in Java ... ozone includes a fully W3C
compliant DOM implementation that allows you to store XML data. You can
use any XML tool to provide and access these data. Support classes for
Apache Xerces-J and Xalan-J are included."
"Besides the native API, ozone provides a ODMG 3.0 interface.
Although not fully ODMG compliant it helps you to port applications
to/from ozone."
"ozone does not depend on any back-end database or mapping
technology to actually save objects. It contains its own clustered
storage and cache system to handle persistent Java objects."
"[ozone] includes the following features:
o multi-user, multi-thread support
o object level access rights
o fully transaction based
o JTA/XA support
o deadlock recognition
o BLOB support
o XML (DOM) support
o ODMG 3.0 support
o Garbage collection"
ozone is part of the Infozone framework.
Sedna XML DBMS
Developer: Management Of Data & Information Systems, Institute for System Programming of the Russian Academy of Sciences
URL: http://modis.ispras.ru/sedna/index.htm
License: Free
Database type: Proprietary
Entry last updated: June, 2004
From the Web site:
"Sedna XML DBMS is a native full-featured data management system. It is designed having the following main goals in mind:
o Support for all traditional DBMS features (such as update and
query languages, query optimization, fine-grain concurrency control,
various indexing techniques, recovery and security),
o Efficient support for unlimited volumes of document-centric
and data-centric XML documents that may have a complex and irregular
structure,
o Full support for the W3C XQuery language in such a way that
the system can be efficiently used for solving problems from different
domains such as XML data querying, XML data transformations and even
business logic computation (in this case XQuery is regarded as a
general-purpose functional programming language)."
"[Features include:]
o Support for the W3C XQuery language
o Support for a declarative update language
o Native XML data storage structures designed for efficient
support for both queries and updates (no underlying relational or
another DBMS). The XML data storage is based on descriptive schema
(also called DataGuide)
o JAVA API and Scheme API for application development
o Open client/server protocol over sockets that allows implementing APIs for other programming languages
o Administration via easy-to-use command line utilities"
[Ed. -- The declarative update language is based on the extensions to XQuery proposed by the W3C and Patrick Lehti.]
Sekaiju (known as Yggdrasill in Japan)
Developer: Media Fusion
URL: http://www.mediafusion.co.jp/usa/seihin/sekaiju/index.html
License: Commercial
Database type: Proprietary
Entry last updated: February, 2002
Sekaiju is a native XML database that has a proprietary data store
designed to store well-formed XML documents. The data store uses "baskets" and
"pockets" (the latter are "like a table" in a relational database),
supports two-byte characters, and can store documents that are up to 2
GB in size.
Sekaiju has local and remote COM interfaces, making it
accessible via Visual Basic, as well as an HTTP interface. Its query
language is XBath, a proprietary language based on XQL. Indexes are
automatically built for all nodes (element tags, attributes, and
PCDATA) in version 1.0 and for user-specified nodes in version 1.5.
Updates are supported only by replacing entire documents.
Transactions are supported through a versioning (log file) mechanism
which is designed to minimize conflicts due to reading and writing the
same document at the same time. Locking is done at the pocket level in
version 1.0 and at the pocket or document level in version 1.5.
Rollbacks occur automatically when problems occur in version 1.0; users
can also request them directly in version 1.5. Security features
include 256-bit encryption and password protection, with access
controllable at the pocket level.
Tools include a forms editor, a GUI-based management tool, backup/restore tools, and a toolkit for parallel processing.
SQL/XML-IMDB
Developer: QuiLogic
URL: http://www.quilogic.cc/
License: Commercial
Database type: Proprietary XML store plus relational store
Entry last updated: February, 2003
SQL/XML-IMDB is an in-memory database with both native XML and
relational data stores. While both data stores organize data in tables,
a "table" in the XML data store is what most other native XML databases
refer to as a collection, with one XML document per "row". Tables can
be created as either local to a particular process or shared among
processes and use compression to minimize memory use. Both types of
tables are indexed with TST-trees, which "combine the speed advantage
of a hash table with the ordered access of a binary tree", and XML
tables are also indexed with "Reverse-Lookup" and
"Token-Segment-Build-Up" mechanisms. While there does not appear to be
a way to directly store the entire database to disk, individual
relational tables can be saved as text files and individual XML tables
can be saved as XML documents.
SQL/XML-IMDB supports both XQuery and a "significant subset"
of SQL92. This allows XML queries against XML data and SQL queries
against relational data. In addition, it extends XQuery so that users
can mix XML and relational data. To do this, it allows SQL statements
in "any part of [an] XQuery statement where an expression is allowed".
From a practical standpoint, it appears that this means SELECT
statements are used anywhere except in a RETURN clause and INSERT,
UPDATE, and DELETE statements are used in RETURN clauses.
When a SELECT statement is used, the returned result set is mapped to
an XML document with a table-based mapping. That is, each row in the
result set is mapped to a <row> element and each column is mapped
to a child of that <row> element. This allows XQuery variables to
be bound to individual rows or columns in the result set. When any type
of SQL statement is used, it can include XQuery variables. For example,
these can be used in the WHERE clause of a SELECT statement to
correlate relational and XML data, or in the VALUES clause of an INSERT
statement to transfer data from XML documents to relational tables.
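The table-based mapping described above can be sketched in Python. The example uses the standard sqlite3 module purely as a stand-in relational store; it is not SQL/XML-IMDB's API, and the root element name "rows", the function name result_set_to_xml, and the sample table are assumptions made for illustration. It also assumes that column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def result_set_to_xml(cursor, root_name="rows"):
    """Map each row to a <row> element and each column to a child of it."""
    root = ET.Element(root_name)
    columns = [description[0] for description in cursor.description]
    for record in cursor:
        row = ET.SubElement(root, "row")
        for column, value in zip(columns, record):
            child = ET.SubElement(row, column)
            # NULL values become empty elements in this sketch.
            child.text = "" if value is None else str(value)
    return root


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO parts VALUES (?, ?)",
                     [(1, "bolt"), (2, "nut")])
    cursor = conn.execute("SELECT id, name FROM parts ORDER BY id")
    xml_text = ET.tostring(result_set_to_xml(cursor), encoding="unicode")
    conn.close()
    return xml_text


if __name__ == "__main__":
    print(demo())
```

Once the result set has this shape, a query language that works on XML can address individual rows or columns with ordinary path expressions such as row/name.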
SQL/XML-IMDB also extends XQuery with operators to update XML
documents. Supported operations include deleting nodes, renaming nodes,
updating node values, replacing nodes, and inserting new nodes before
or after existing nodes. Note that these operations cannot be performed
inside a transaction.
SQL/XML-IMDB has a proprietary API for interacting with the
database. This includes functions for preparing and executing SQL and
XQuery statements, beginning, committing, and rolling back
transactions, transferring data between internal tables and external
files or application variables, and bi-directional iteration over
result sets. It is worth noting that XQuery results are returned in
result sets just like SQL results. Each item in an XQuery sequence is
returned as a separate column, with atomic values mapped to columns of
the appropriate data type and nodes mapped to XML strings. When an
XQuery statement returns multiple sequences, these are mapped to
multiple rows in the result set.
SQL/XML-IMDB can be used from Microsoft .NET, Visual C++,
Visual Basic, Office, and IIS/ASP, Borland C++ and Delphi, Perl, and
PHP.
Sonic XML Server (formerly eXcelon)
Developer: Sonic Software (who bought eXcelon Corp.)
URL: http://www.sonicsoftware.com/products/sonic_xml_server/index.ssp
License: Commercial
Database type: Object-oriented (ObjectStore). Relational and other data through Data Junction
Entry last updated: April, 2003
[Note: The following is a description of eXcelon's eXtensible
Information Server (XIS). Sonic Software bought eXcelon and renamed XIS
as Sonic XML Server. It is not known whether the following description
is still accurate, since the Sonic Web site has little technical
information about Sonic XML Server. Ed. -- 4/04]
eXtensible Information Server is a native XML database built
on top of ObjectStore. Documents are parsed on import, with individual
nodes stored as hierarchically linked objects. This means that
documents do not have to be parsed at run time and large documents can
be processed without having to read the entire document into memory.
Documents are not required to have a DTD or conform to a predescribed
schema. They can be indexed using both value and structural indexes.
(Value indexes index element and attribute values; structural indexes
index element and attribute names.) They can also be arranged in
collections; these can be nested, resulting in a file system metaphor.
eXtensible Information Server supports queries through XQuery,
XPath with extension functions, and a proprietary update language
(updategrams). Updategrams consist of an XPath to a node, an operation
on that node (insert before/after, update, delete), and any data needed
to carry out the operation. As an add-on, eXtensible Information Server
supports full-text search through the Verity engine. Users pass queries
(using Verity's query language) to eXtensible Information Server, which
passes them to Verity. Verity executes the queries (using its own
indexes) and returns pointers to the relevant documents in eXtensible
Information Server.
eXtensible Information Server supports two kinds of
server-side functions, which can be written in Java, VB, or COM. The
first, known as server extensions,
run inside the current transaction and are commonly used in XPath
expressions or as triggers associated with inserts, updates, or
deletes. These can directly manipulate data in the cache using a
server-side DOM implementation. The second, known as servlets,
must define their own transaction boundaries and are generally used to
implement extensions to the database as a whole, such as a JMS queue.
eXtensible Information Server also supports a concept called
"Binder Documents". This allows users to link existing documents as
well as to build virtual documents that consist of nothing but links.
Links are traversed transparently during queries and update operations,
which means that virtual documents can be used to perform queries and
updates over multiple documents in a single operation. Note that the
application must currently enforce the referential integrity of links
(such as through triggers). That is, it must ensure that the
document/fragment to which a link points actually exists.
eXtensible Information Server can integrate backend data through the XConnects Integration Engine, which uses the Data Junction Universal Translation Suite.
This provides links to many different data formats, including
relational databases. Because the links are two-way, it means that
backend data sources can be updated through eXtensible Information
Server. Users can also write their own XConnects connectors with a Java
API, a scripting language, and Stylus Studio (an IDE for XSLT and XML).
eXtensible Information Server supports transactions and can
participate in XA transactions. However, it cannot currently manage XA
transactions, so the application must coordinate any XA transactions
that include eXtensible Information Server and other data sources, such
as backend data stores. Other database features include distributed
caching, partitioning, online backup and restore, and clustering
support.
Finally, eXtensible Information Server comes with Java, COM,
and .NET APIs, a JCA-compliant driver, a built-in XSLT processor, and a
set of GUI development tools. These include an XML editor, an XSLT
editor, a schema editor (XML Schemas and DTDs), an XSLT/Java debugger,
an XML-to-XML mapping tool, and tools for mapping backend data to XML
documents.
Tamino
Developer: Software AG
URL: http://www.softwareag.com/Corporate/products/wm/tamino/default.asp
License: Commercial
Database type: Proprietary. Relational through ODBC.
Entry last updated: November, 2002
Tamino XML Server is a suite of products built in three layers --
core services, enabling services, and solutions (third-party
applications) -- which may be purchased in a variety of combinations.
Core services include a native XML database, an integrated relational
database, schema services, security, administration tools, and Tamino
X-Tension, a service that allows users to write extensions that
customize server functionality.
The XML engine uses the Data Map, which describes where the
data in a given XML document is stored. This allows individual XML
documents to be composed of data from multiple, heterogeneous sources,
such as the native XML data store, relational databases, and the file
system. Since the connections to external data (made through the X-Node
module) are live and bidirectional, Tamino may thus be used to perform
heterogeneous joins and updates.
Tamino's XML support includes the DOM, JDOM, SAX, and XML:DB
APIs, an extended XPath implementation called X-Query (not to be
confused with W3C XQuery, which it predates), full-text retrieval,
processing of XML documents with server-side XSL and CSS, and limited
support for SOAP. It can store schema-less documents and can use schema
information (including a subset of XML Schemas) if it is available.
The internal SQL engine is directly addressable through ODBC,
JDBC, and OLE DB. However, when addressed via these APIs, it cannot
integrate data from the internal XML data store or from external data
sources. (As noted above, the reverse is true. That is, with the help
of the X-Node, the XML engine can integrate data from the XML data
store and other databases, including the internal SQL engine.)
Enabling services include X-Port, X-Plorer, X-Application,
various APIs (mentioned above), X-Node (also mentioned above), and the
WebDAV Server. X-Port provides URL-based data transfer through various
standard HTTP servers, X-Plorer is a browser-based navigation tool for
documents stored in Tamino, and X-Application is a set of JSP tags for
accessing Tamino through Web pages.
The WebDAV Server adds namespace management (nested
collections or directories), additional properties (such as
last-modified, content length or content type) and overwrite protection
(persistent locking) to the existing Tamino XML Server functionality.
This allows Tamino to serve as a virtual file system (Web folder) where
the information can be stored and retrieved using a standard Web
browser and the common drag and drop metaphor.
(Note: In spite of rumors to the contrary, Tamino is not built
on top of Adabas, a hierarchical database from Software AG. Instead,
the Tamino data store was built from the ground up as a native XML
database, obviously drawing on the knowledge gained from developing
Adabas.)
TeraText DBS (formerly SIM (Structured Information Manager))
Developer: TeraText Solutions (A Division of SAIC)
URL: http://www.teratext.com/get/page/browser/browser?category=Products/TeraText%20DBS,
http://www.saic.com/products/software/teratext/
License: Commercial
Database type: Proprietary
Entry last updated: August, 2002
From the Web site:
"TeraText DBS was designed specifically to store, retrieve and
manipulate structured text. ... [It] also indexes all or part of the
document using XML standards, enabling complex and comprehensive
searching."
"[TeraText DBS is] designed to support XML, SGML, Unicode,
Z39.50, HTTP and other industry standards, [and its] components are
modular. They can be installed as a suite or as individual modules to
work with existing database management and document-authoring systems."
"A content server enables searches on structural elements or
document characteristics ... [It] also supports the ... worldwide
industry standard protocol for information retrieval, Z39.50."
"A unique applications server provides immediate access to any
TeraText database. TeraText DBS supports plug and play modules for
complex value added Web services."
"Java, C++ and SOAP APIs as well as WebDav, LDAP, Microsoft Word, PDF and other plug-in adapters are available."
TEXTML Server
Developer: IXIASOFT, Inc.
URL: http://www.ixiasoft.com/default.asp?xml=/xmldocs/webpages/textml-server.xml
License: Commercial
Database type: Proprietary (Document-based)
Entry last updated: June, 2005
TEXTML Server is a native XML database that stores, indexes, and
retrieves whole XML documents. A TEXTML Server installation consists of
one or more document bases, each of which consists of a document
repository and a set of indexes. The document repository is organized
as a hierarchical set of collections and can store both XML and non-XML
documents. All documents are stored intact. The major difference
between XML and non-XML documents is that XML documents are parsed at
insert time to create indexes. While non-XML documents are not parsed,
they can be associated with an XML document that provides indexable
metadata for the non-XML document.
Unlike most native XML databases, the indexes in TEXTML Server
effectively form an additional schema layer on top of the documents
stored in the database. This is because indexes are defined using one
or more XPath expressions. Since these can refer to any document in the
database, the effect is that a single index can refer to more than one
field. For example, an author index might refer to the AuthorName
element in one set of documents and the StoryAuthor attribute in
another set of documents. Furthermore, because indexes are defined
using XPath expressions, it is possible to transform values and index
the transformed values. TEXTML Server supports five different types of
indexes: word (token), string, numeric, date, and time.
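A single index fed by several path expressions can be sketched in Python as follows. This is not TEXTML Server's index definition format. The (path, attribute) pairs, the Story element that carries the StoryAuthor attribute, the sample documents, and the lowercasing used to show a transformed value are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical index definition: (element path, attribute name or None).
AUTHOR_INDEX = [
    (".//AuthorName", None),       # index the element's text
    (".//Story", "StoryAuthor"),   # index an attribute value
]


def build_index(definition, documents):
    """Map each normalized value to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for path, attribute in definition:
            for elem in root.findall(path):
                value = elem.get(attribute) if attribute else elem.text
                if value:
                    # Index a transformed (trimmed, lowercased) value.
                    index[value.strip().lower()].add(doc_id)
    return index


# Hypothetical documents with two different structures.
DOCS = {
    "book-1": "<Book><AuthorName>Jane Doe</AuthorName></Book>",
    "news-7": "<News><Story StoryAuthor='jane doe'>...</Story></News>",
    "news-8": "<News><Story StoryAuthor='John Roe'>...</Story></News>",
}

if __name__ == "__main__":
    for value, doc_ids in sorted(build_index(AUTHOR_INDEX, DOCS).items()):
        print(value, sorted(doc_ids))
```

Because both structures feed the same index, a lookup by author finds documents of either kind without the query needing to know where each document keeps the author's name.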
TEXTML Server has its own, XML-based query language. Queries
are defined as a series of boolean tests over specific indexes or the
full text of the documents. Tests are generally for equality. In
addition, numeric, date, and time indexes support range tests, and word
and string indexes support wild-card tests. Tests can then be joined
with a number of operators, including And, Or, And Not, Near,
adjacency, and frequency. Queries return whole documents and can sort
results based on index values, document properties, and hit counts.
In addition to being able to associate XML documents with
non-XML documents, TEXTML Server also has a Universal Converter that
can convert more than 225 file formats (word processor, spreadsheet,
presentation, drawing, bitmap, and so on) to XML. This uses Stellent's Outside In XML Export
and extracts document "contents, presentation information, and
metadata". Extracted information is stored in a document that uses the
SearchML schema, also defined by Stellent. Converted documents can then
be searched directly or associated with the original documents as
indexing documents.
Other features of TEXTML Server include check-in/check-out,
versioning, support for plug-ins that are run at insert time, and COM,
Java, .NET, WebDAV, and OLE DB APIs. Security can be specified at the
document, collection, or document-base level. System features include
fault tolerance, replication, load management, and automated recovery.
TigerLogic XML Data Management Server (XDMS)
Developer: Raining Data
URL: http://www.rainingdata.com/products/tl/abouttl.html
License: Commercial
Database type: Pick
Entry last updated: January, 2003
TigerLogic XML Data Management Server (XDMS) is a database designed
to store multiple kinds of data, including "structured, XML, and
unstructured information". (Examples of the latter are office
documents, email, and graphics.) Data is stored in the TigerLogic
Native XML Data Store, which "leverages the Pick Universal Data Model".
As XML documents are inserted into the database, an XML Profiler reads
the incoming documents and gathers information to build indexes. These
are used by the query processor, which supports XPath. TigerLogic XDMS
also supports XSLT.
TigerLogic XDMS has a Java API and is also accessible over
SOAP, HTTP, and JCA. It supports both DTDs and XML Schemas. Of
interest, it supports XA transactions, and provides "on-line backup and
recovery".
Timber
Developer: University of Michigan
URL: http://www.eecs.umich.edu/db/timber
License: Open Source (for non-commercial users)
Database type: Shore, Berkeley DB
Entry last updated: October, 2005
Timber is a native XML database that has an architecture "as close
as possible to that of a relational database," in order to "reuse,
where appropriate, the technologies developed for relational databases
over the past several decades". The basis of Timber is "an XML algebra
that manipulates sets of ordered, labeled trees". The primary
difficulties of such an algebra include the "complex and variable
structure of trees in a set, and issues of ordering."
By default, Timber uses Shore as its underlying data store. It can also use Berkeley DB. It supports a number of different types of indexes, including element, attribute, text, inverted, parent, and join indexes.
Timber supports a subset of XQuery. Users can enter queries
either as XQuery expressions or as logical or physical query plans
using Timber's logical or physical plan syntax. The latter allows
advanced users to optimize queries by hand, as well as to perform some
operations not supported through XQuery. Timber extends XQuery with
functions for deleting nodes or their contents, updating the contents
of a node, and inserting elements or attributes. In addition, Timber
has a command line option for appending the contents of an XML document
to a document already in the database.
Timber has command line, GUI, SOAP, and Web interfaces for performing both queries and administrative functions.
TOTAL XML (formerly Socrates XML)
Developer: Cincom
URL: http://tiger.cincom.com/pages/aboutTotalXML.html
License: Commercial
Database type: Object-relational, external relational through ODBC
Entry last updated: July, 2003
TOTAL XML is a native XML database that can store documents as
objects or text. It can store data in its own object-relational data
store, an external relational database, or a combination of the two. It
is therefore possible to distribute the data for a document across
multiple databases. In addition, TOTAL XML can store non-XML data, such
as "standard relational data" and BLOBs.
Unlike other native XML databases, the objects used to store
XML documents are specific to each DTD, but inherit from an object
model that supports the Infoset. Thus, TOTAL XML has characteristics
similar to both native XML databases and XML-enabled relational
databases. Like an XML-enabled relational database, it is possible to
query the data directly with SQL. However, documents cannot be stored
until the user has defined a map from the DTD to the database. (A
utility is available for generating maps for DTD-less documents.) Like
a native XML database, the database stores information about the full
physical structure of a document and it is possible to round-trip
documents.
TOTAL XML supports three different query languages. XML
documents can be queried with XPath or an extended form of SQL, which
can query relational data and BLOBs as well. Text data can be queried
with regular expressions. TOTAL XML also supports the XML:DB API.
When XML documents are stored as DTD-specific objects,
applications can access these objects through the object-oriented
capabilities of JDBC 2.0. The objects can be used directly or converted
to a DOM tree using the previously defined maps. The DOM is lazily
populated, so data is retrieved from the database only when needed.
When documents are stored as text, applications can access them with
JDBC or ODBC and they are returned as text.
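Lazy population of a tree can be sketched in a few lines of Python. This is a generic illustration of the technique, not TOTAL XML's implementation: the LazyNode class, the fetch_children callback, and the in-memory STORE dictionary standing in for the database are all hypothetical.

```python
class LazyNode:
    """A tree node whose children are fetched only on first access."""

    def __init__(self, node_id, fetch_children):
        self.node_id = node_id
        self._fetch_children = fetch_children
        self._children = None  # not loaded yet

    @property
    def children(self):
        if self._children is None:
            self._children = [
                LazyNode(child_id, self._fetch_children)
                for child_id in self._fetch_children(self.node_id)
            ]
        return self._children


# Hypothetical stand-in for the database: node id -> child ids.
STORE = {
    "order": ["customer", "items"],
    "customer": [],
    "items": ["item"],
    "item": [],
}


def demo():
    fetch_log = []

    def fetch_children(node_id):
        fetch_log.append(node_id)  # record every simulated database read
        return STORE[node_id]

    root = LazyNode("order", fetch_children)
    reads_before = list(fetch_log)
    names = [child.node_id for child in root.children]
    names_again = [child.node_id for child in root.children]
    return reads_before, names, names_again, fetch_log


if __name__ == "__main__":
    print(demo())
```

In this sketch nothing is read until the children of a node are requested, and a second request reuses the nodes already loaded, so the subtree under "items" is never fetched unless the application walks into it.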
TOTAL XML can integrate data from legacy databases (including
VSAM, IMS, IDM, and Adabas) using Striva DETAIL. The integrated data
can be live or a copy stored in TOTAL XML.
TOTAL XML ships with a number of tools, including utilities to generate classes and maps from DTDs and administration tools.
Virtuoso
Developer: OpenLink Software
URL: http://www.openlinksw.com/virtuoso/
License: Commercial
Database type: Proprietary. Relational through ODBC
Entry last updated: November, 2000
Virtuoso is a heterogeneous join engine featuring security,
transactions (including two-phase commit), and replication. Its query
engine supports heterogeneous views, stored procedures, scrollable
cursors, and full-text search. It accesses external data sources
through ODBC, as well as having its own relational data store.
Virtuoso supports XML in a number of ways. First, it contains
a native XML data store, which is non-relational and can store and
index XML documents in parsed or unparsed form. Second, it can transfer
data from relational databases to XML documents (although not the other
direction), using the same mapping found in the FOR XML clause in Microsoft SQL Server.
Third, it includes an implementation of XPath. Although this only works
on "native" XML data, relational data can be included by first
transferring it to XML. Finally, it includes support for XSLT,
executing stored procedures through SOAP, and WebDAV.
[February, 2002] Virtuoso has a demo implementation of XQuery
that runs over its database. Of interest, this can query virtual
documents, such as those built at run time from a relational database.
XDBM
Developer: Matthew Parry, Paul Sokolovsky
URL: http://sourceforge.net/projects/xdbm/
License: Open Source
Database type: Proprietary (Node-based)
Entry last updated: November, 2000
XDBM uses an interface that is "based upon the DOM standard". It stores
XML documents in a pre-parsed, indexed format and resolves memory
problems by leaving parts of the document on disk until they are needed.
XDB
Developer: ZVON.org
URL: http://zvon.org/index.php?nav_id=61
License: Open Source
Database type: Relational (PostgreSQL only?)
Entry last updated: November, 2001
A native XML database built on a relational database. (It is not clear
if databases other than PostgreSQL are supported.) The database stores
data in a proprietary set of tables and includes a partial implementation
of XPath. Written in C++.
XediX TeraSolution
Developer: AM2 Systems
URL: http://www.am2systems.com/technologies-EN.html
License: Commercial
Database type: Proprietary
Entry last updated: June, 2005
XediX TeraSolution is a native XML database built on a proprietary
data store. Users can specify which elements and attributes to index,
and searches are performed with a proprietary language that "permits
addressing in the XML tree in accordance with XPath expressions".
Search results can further be refined through the use of regular
expressions.
XediX TeraSolution can also store non-XML documents through
the use of external entities. Apparently, an XML document provides
metadata for one or more non-XML documents and references them through
external entities. The non-XML documents are stored alongside the
metadata document and are also indexed and searched via the XML
metadata document.
Security is provided through the use of users and groups, and
can be applied at any level of granularity within XML documents. This
allows administrators to assign access rights such that specific users
can view only parts of a given XML document.
X-Hive/DB
Developer: X-Hive Corporation
URL: http://www.x-hive.com/products/db/index.html
License: Commercial
Database type: Proprietary
Entry last updated: May, 2005
X-Hive/DB is a native XML database that includes support for
XQuery, XPath, XML Schemas, DOM Level 3, XSLT, and XSL-FO, as well as
transactions, user- and group-level access control, JAAS (Java
Authentication and Authorization Service), replication, load balancing
across multiple servers, and BLOB storage. Additional features include:
o Indexes. X-Hive/DB supports element name, value, full-text,
and custom indexes, as well as "library, ID attribute, and
context-conditioned" indexes. Full-text indexes use a proprietary
indexing mechanism; these indexes can be searched from XQuery through
the xhive:fts (full-text search) function. In addition, users can
integrate their own full-text index engines. Custom indexes are based
on a user-implemented DOM NodeFilter.
o Linking. A link engine that implements XLink and XPointer supports bi-directional links, link-bases, and link management.
o External data. The JDBC Bridge can retrieve a snapshot of
relational data through JDBC. The data is converted to XML using a
table model and can be integrated into other documents.
o WebDAV. Remote clients can directly access collections and documents in the database through WebDAV.
o SOAP. Applications can store and retrieve documents, execute XQuery queries, retrieve XML schemas, and so on through SOAP.
o Custom JSP tags. A tag library for calling X-Hive/DB through Java Server Pages.
o J2EE Resource Adapter. An implementation of the J2EE Resource
Adapter allows X-Hive/DB applications to use the transaction management
facilities of an EJB application server.
o Versioning. Both linear and branched versioning (multiple versions of the same document) are supported.
In addition, an implementation of XUpdate (from the XML:DB Initiative) that uses Lexus may be downloaded from the X-Hive Web site.
Xindice (see also dbXML)
Developer: Apache Software Foundation
URL: http://xml.apache.org/xindice/
License: Open Source
Database type: Proprietary (Node-based)
Entry last updated: June, 2005
Xindice is a native XML database written in Java that is designed
to store large numbers of small XML documents, as well as non-XML
documents. It can index element and attribute values and compresses
documents to save space. Documents are arranged into a hierarchy of
collections and can be queried with XPath. (Collection names can be
used as part of the XPath query syntax, meaning it is possible to perform
XPath queries across documents.) For updates, Xindice supports the
XUpdate language from the XML:DB Initiative. Finally, Xindice comes
with an experimental linking language that is similar to XLink and
allows users to replace or insert content in an XML document at query
time.
Xindice supports three APIs: the XML:DB API
(also from the XML:DB Initiative), a CORBA API, and an XML-RPC plugin
which supports access from languages such as PHP, Perl, and
Applescript. In addition, Xindice provides XMLObjects, which allows
users to extend the server functionality.
Xindice comes with a set of command line tools for using and administering the database, as well as complete documentation.
XML Transactional DOM
Developer: Ontonet
URL: http://ontonet.com/XML_Product.html
License: Commercial
Database type: Object-oriented (R1 Enterprise)
Entry last updated: August, 2003
XML Transactional DOM is a native XML database built on top of
Ontonet's R1 Enterprise object-oriented database. XML documents are
stored in the database as Infoset objects. They may be created by
passing an existing XML document to the database, which is then parsed
and used to create Infoset objects, or directly through a DOM tree.
The XML Transactional DOM implements DOM Level 2 on top of the
Infoset objects. It lazily instantiates nodes and uses Java Soft
Reference Caching to allow the Java Garbage Collector to collect nodes
when the JVM runs out of memory. (Collected nodes that are still in use
are reinstantiated from the database as needed.) The XML Transactional
DOM also implements XQuery over the same Infoset objects, with query
results returned as DOM nodes.
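The combination of lazy instantiation and collectible nodes can be sketched as follows. This is only an analogy, not Ontonet's implementation: Python has no soft references, so the sketch uses weak references, which are cleared as soon as a node is no longer in use rather than when memory runs low. The Node and NodeCache classes and the dictionary standing in for the database are hypothetical.

```python
import weakref

class Node:
    def __init__(self, node_id, name):
        self.node_id = node_id
        self.name = name

class NodeCache:
    """Instantiates nodes on demand and lets unused ones be collected."""
    def __init__(self, store):
        self._store = store                         # stand-in for the database
        self._live = weakref.WeakValueDictionary()  # does not keep nodes alive
        self.loads = 0                              # counts trips to the store

    def get(self, node_id):
        node = self._live.get(node_id)
        if node is None:                            # never loaded, or collected
            self.loads += 1
            node = Node(node_id, self._store[node_id])
            self._live[node_id] = node
        return node

cache = NodeCache({1: "Books", 2: "Book"})
first = cache.get(1)     # loaded from the store
again = cache.get(1)     # served from the cache; same object as first
print(first is again, cache.loads)
```

A node that has been collected is simply rebuilt from the store on the next request, so callers never need to know whether it was cached.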
Transactions are supported by an additional interface on the
Document object. This has a method to return a DOMTransaction object,
which implements JTA (Java Transaction API) transactions as well as
savepoints. Savepoints are implemented using the same methods as are
used in JDBC. They can be nested to arbitrary depth.
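Nested savepoints behave the same way in most transactional stores, so a minimal sketch using SQLite may help. It does not use the DOMTransaction interface; the table and savepoint names are invented. In JDBC the corresponding Connection methods are setSavepoint, rollback(Savepoint), and releaseSavepoint.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE node (name TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO node VALUES ('Books')")
conn.execute("SAVEPOINT outer_sp")
conn.execute("INSERT INTO node VALUES ('Book')")
conn.execute("SAVEPOINT inner_sp")               # nested inside outer_sp
conn.execute("INSERT INTO node VALUES ('Price')")
conn.execute("ROLLBACK TO SAVEPOINT inner_sp")   # undoes only the 'Price' insert
conn.execute("RELEASE SAVEPOINT outer_sp")       # keeps the 'Book' insert
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT name FROM node ORDER BY rowid")]
print(names)
```

Rolling back to the inner savepoint discards only the work done after it, while everything before it, including work done under the outer savepoint, is committed.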
Additional features of the XML Transactional DOM include
URI-addressable document collections, a JAXP implementation, the
ability to store XML documents in their original form (such as for
legal reasons), and serialization of DOM trees or fragments to Java
OutputStreams or Writers.
XpSQL
Developer: Makoto Yui
URL: http://gborg.postgresql.org/project/xpsql/projdisplay.php
License: Open Source
Database type: Relational (PostgreSQL)
Entry last updated: March, 2004
XpSQL is a native XML database built on top of PostgreSQL. It stores
documents by decomposing them into fragments and storing these
fragments in a set of predefined tables. XpSQL has a command line
utility for loading XML documents into the database, as well as
PostgreSQL functions for retrieving document fragments by node ID. It
also has PostgreSQL functions that implement DOM Level 2 and XPath.
There are two main XPath functions. XPath2SQL converts an
XPath query into an SQL query over the tables used to store XML
documents. The SQL query can then be used to "execute" the XPath query.
Results from the SQL query are returned as XML(?). The XPath_Eval
function accepts an XPath query and returns rows containing two
columns: document ID and node ID. In other words, it returns a list of
nodes that satisfy a given XPath query. XPath_Eval is typically used in
a FROM clause. For example, the following query uses XPath_Eval to
retrieve the value of Price elements. (xml_node is the table used by
XpSQL to store individual nodes.)
SELECT xml_node.value
FROM xml_node, XPath_Eval('/Books/Book/Price') AS price_nodes
WHERE xml_node.id = price_nodes.id
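To make the idea of evaluating XPath over a node table concrete, here is a rough sketch in Python with SQLite standing in for PostgreSQL. It is not XpSQL's code or schema: only the xml_node table name and its id and value columns come from the query above, while the parent_id and name columns, the sample data, and the xpath_eval function are assumptions. The function handles only absolute paths made of child steps.

```python
import sqlite3

def xpath_eval(conn, path):
    """Return ids of nodes matched by an absolute path of child steps."""
    steps = [s for s in path.split("/") if s]
    # Start from root nodes (no parent) whose name matches the first step.
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM xml_node WHERE parent_id IS NULL AND name = ?",
        (steps[0],))]
    for name in steps[1:]:
        if not ids:
            break
        marks = ",".join("?" * len(ids))
        # Each further step selects children of the nodes matched so far.
        ids = [r[0] for r in conn.execute(
            f"SELECT id FROM xml_node WHERE name = ? "
            f"AND parent_id IN ({marks}) ORDER BY id",
            (name, *ids))]
    return ids

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xml_node (id INTEGER, parent_id INTEGER, name TEXT, value TEXT)")
conn.executemany("INSERT INTO xml_node VALUES (?, ?, ?, ?)", [
    (1, None, "Books", None),
    (2, 1, "Book", None), (3, 2, "Price", "10.00"),
    (4, 1, "Book", None), (5, 4, "Price", "12.50"), (6, 4, "Title", "XML"),
])

ids = xpath_eval(conn, "/Books/Book/Price")
marks = ",".join("?" * len(ids))
prices = [r[0] for r in conn.execute(
    f"SELECT value FROM xml_node WHERE id IN ({marks}) ORDER BY id", ids)]
print(ids, prices)
```

Each location step becomes one lookup on the parent column, which is why returning node IDs, as XPath_Eval does, fits naturally into a join against the node table.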
XQuantum XML Database Server
Developer: Cognetic Systems
URL: http://www.cogneticsystems.com/server.html
License: Commercial
Database type: Proprietary
Entry last updated: June, 2006
XQuantum XML Database Server is a native XML database built on a
proprietary data store. It supports a subset of XQuery, a subset of the
XQuery full-text specification, and XSLT.
XQuantum optimizes queries with a cost-based algorithm, which
uses statistics about the data to optimize the search process. The
query processor also relies on "recursive XML indexing" (a schemaless
indexing method), lazy query evaluation, and stream processing of
queries.
XQuantum supports static typing through its own typing
mechanism, which "generalizes XQuery's sequence type syntax to include
full regular expression types" and is used instead of XML Schemas.
Types (effectively schemas for individual XML documents) can be
declared in the prolog of an XQuery query or in external type modules.
They are applied in the query through explicit validation and are used
to provide type information to the query processor.
XQuantum includes a Web server, which allows it to use HTTP as
its API. That is, queries are embedded in URLs and results are returned
as an XML stream. Queries can also be placed in XQuery Server Pages.
These are preferable for URLs exposed to the public, as they are more
secure (the query is not exposed to the public) and less fragile (the
query can be changed without changing the URL).
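The trade-off is easier to see with a concrete URL. The sketch below only builds one; the host, path, and the parameter name "query" are hypothetical and are not taken from the XQuantum documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def query_url(base, xquery):
    """Embed an XQuery expression in a URL as an encoded parameter."""
    return base + "?" + urlencode({"query": xquery})

# Hypothetical endpoint and query text.
xquery = 'for $b in doc("books.xml")//Book return $b/Price'
url = query_url("http://localhost:8080/xquery", xquery)
print(url)

# Anyone holding the URL can recover the full query text from it.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Because the whole query travels in the URL, it is visible to anyone who sees the link, and any change to the query changes the link. A URL that merely names a server page avoids both problems.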
XQuantum is also available as the XQuantum XML Database Appliance, a dedicated server running Linux and XQuantum.
XStreamDB Native XML Database
Developer: Bluestream Database Software Corp.
URL: http://www.bluestream.com/products/xstreamdb32
License: Commercial
Database type: Proprietary (Node-based)
Entry last updated: May, 2003
From the company:
"Bluestream XStreamDB(tm) version 3.0 is a native XML database,
built in pure Java with XQuery, full text search, Java API, and support
for schemas, DTDs, and binary and other non-XML datatypes. XStreamDB is
accessible using a JDBC-like Java API, the XStreamDB Explorer GUI
application, scripter, or using WebDAV to reach documents exposed as
URIs. Security is enforced
using MD5 message digest authentication and a user permissions scheme."
"XML documents are stored in a compressed object
representation, using Bluestream's Streamstore database storage engine
(also available separately). The database has a full transactions
architecture that meets the four ACID requirements: Atomic, Consistent,
Isolated, and Durable. Transaction support includes read, write, and
update locks, as well as deadlock detection and victim selection.
Commits and rollbacks are supported so that the system can recover in
the case of a crash."
"It supports multiple, concurrent sessions, as well as session
pooling, and recycles free space automatically, so compaction is not
required. In addition, it allows partial document updates and document
fragment insertion."
"Documents are stored in 'roots' in 'databases' on the
XStreamDB server. A root is equivalent to a collection. Schemas or DTDs
can be loaded and stored in a collection of 'schemas', and users are
kept in a collection of 'users'. Access permissions can be assigned on
documents with the
built-in user permission scheme. XStreamDB stores both XML and binary
document types, with associated mimetypes."
"Collections of XML documents in document roots can be forced
to be schema valid by attaching a schema to the root. XStreamDB
supports both W3C XML Schemas and DTDs. The XStreamDB resource manager
can assign resource information to documents to expose them as URI
unique identifiers (Universal Resource Identifier) through WebDAV, or
the Resources API. Databases and roots are exposed as 'categories', and
documents are exposed as 'resources' within those categories. The
resource manager supports mimetypes, created sub-categories, locking,
and naming. Resources can also be checked out and checked in to the
file system by users."
"XStreamDB supports the XQuery query language for XML data,
and has extended it to support insert, update, and full text searching
capabilities."
"XStreamDB supports both value indexes and full text indexing.
XML document roots with value indexes, will index on the value of data
in a specified element or attribute. Full text indexes store a complete
index of all content in all documents in the root."
"XQuery queries with full text expressions will find text
within XML document content using wildcard matching, word proximity,
and phrase matching. Results are matched to the element or attribute in
matching documents, and can be automatically marked."
A note about the history of XStreamDB, also from the company:
"XStreamDB was introduced by Bluestream Database Software Corp.
in the spring of 2000. Soon after its introduction, Bluestream was
acquired by XML Global and its XML database product renamed
GoXML DB. In September 2002, XML Global spun off the XML database
division, reinstating the original company and product names.
Bluestream XStreamDB version 3.0 is built
by Bluestream and marketed by XML Global and other authorized
resellers."
Xyleme Zone Server
Developer: Xyleme SA
URL: http://www.xyleme.com/xml_server
License: Commercial
Database type: Proprietary (Natix)
Entry last updated: July, 2002
Xyleme Zone Server is a native XML database that uses Natix
as its engine. It supports XQuery and indexes documents at run time as
they are added to the database. Xyleme Zone Server can run in clusters
and can distribute queries across multiple machines. Local applications
can access the server directly from C++ or Java, and remote
applications can access it with SOAP. Security is provided on a
per-document basis and the product ships with a set of administration
tools.
Users can categorize documents according to their semantic
type -- financial statements, product documentation, legal documents,
etc. Each category is defined by an "abstract view", which is mapped to
the schema of each class of documents in the category. This allows
users to query all documents in a category by querying the view, rather
than having to query each class of documents separately. The query processor
translates the query against the view into queries against each schema
and returns results that correspond to the view.
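The translation from a view to per-class queries can be sketched in a few lines. This is an illustration of the idea only, not Xyleme's mechanism: the VIEW mapping, the document classes "report" and "manual", and the query_view function are all invented, and simple element paths stand in for XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical abstract view: one logical field, mapped to a path per document class.
VIEW = {
    "title": {"report": "./head/title", "manual": "./meta/name"},
}

def query_view(field, documents):
    """Run a view field against every document, using its class's own path."""
    results = []
    for doc_class, root in documents:
        path = VIEW[field][doc_class]        # translate view field to class path
        results.extend(e.text for e in root.findall(path))
    return results

docs = [
    ("report", ET.fromstring(
        "<report><head><title>Q3 Results</title></head></report>")),
    ("manual", ET.fromstring(
        "<manual><meta><name>Installation Guide</name></meta></manual>")),
]

titles = query_view("title", docs)
print(titles)
```

The caller asks for "title" once and never sees that the two document classes store it under different element names.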
Users can also subscribe to a service that notifies them of
changes to documents. Individual subscriptions are defined as queries,
using a proprietary language that (apparently) extends XQuery.
Subscription queries run at specified intervals, and applications
check the output of these queries to determine what has changed.
Of interest, Xyleme SA provides an online repository of Web
pages. This may be queried across the Web, presumably as part of
queries that also query local data.