XML Indexing & Storage System
HOME PEOPLE PAPERS XISS XISS/R DOWNLOAD LINKS
 
Introduction
Numbering
Algorithms
Data Set
Experiments
Introduction
Schemas
Architecture
Query
Data Set

This research is being sponsored by National Science Foundation CAREER Award IIS-9876037 and Research Infrastructure program EIA-0080123


XISS

With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML data. This poses a new challenge concerning indexing and searching XML data, because conventional approaches based on tree traversals may not
meet the processing requirements under heavy access requests. In this project, we propose a new system for indexing and storing XML data based on a numbering scheme (Extended Preorder Numbering Scheme) for elements. This numbering scheme quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data. We also propose several algorithms for processing regular path expressions, namely,

(1) EE-Join for searching paths from an element to another,

(2) EA-Join for scanning sorted elements and attributes to find element-attribute pairs, and

(3) KC-Join for finding Kleene-Closure on repeated paths or elements.

The EE-Join algorithm is highly effective particularly for searching paths that are very long or whose lengths are unknown. Experimental results from our prototype system implementation show that the proposed algorithms can process XML queries with regular path expressions by up to an order of magnitude faster than conventional approaches.