NFA-based Filtering for Efficient and Scalable XML Routing

Yanlei Diao, Hao Zhang and Michael J. Franklin

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-01-1159
2001

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2001/CSD-01-1159.pdf

Soon, much of the data exchanged over the Internet will be encoded in XML. This encoding can be used as the basis for sophisticated filtering and content-based routing of data using XML queries. Filtering systems such as XFilter represent XML path expressions as Finite State Machines and index them; In contrast, work on continuous query processing for database systems has focused on grouping common parts of relational-style queries. It is clear that for scalable and efficient XML filtering and routing both types of techniques will be needed. We propose a new approach based on Nondeterministic Finite Automata (NFA) that indexes path expressions and captures the overlap between queries naturally. The approach also has the potential of gracefully handling other problems in this environment such as concurrent filtering and online updates. Preliminary experimental results show that the NFA approach can dramatically improve filtering performance.


BibTeX citation:

@techreport{Diao:CSD-01-1159,
    Author = {Diao, Yanlei and Zhang, Hao and Franklin, Michael J.},
    Title = {NFA-based Filtering for Efficient and Scalable XML Routing},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2001},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2001/6195.html},
    Number = {UCB/CSD-01-1159},
    Abstract = {Soon, much of the data exchanged over the Internet will be encoded in XML. This encoding can be used as the basis for sophisticated filtering and content-based routing of data using XML queries. Filtering systems such as XFilter represent XML path expressions as Finite State Machines and index them; In contrast, work on continuous query processing for database systems has focused on grouping common parts of relational-style queries. It is clear that for scalable and efficient XML filtering and routing both types of techniques will be needed. We propose a new approach based on Nondeterministic Finite Automata (NFA) that indexes path expressions and captures the overlap between queries naturally. The approach also has the potential of gracefully handling other problems in this environment such as concurrent filtering and online updates. Preliminary experimental results show that the NFA approach can dramatically improve filtering performance.}
}

EndNote citation:

%0 Report
%A Diao, Yanlei
%A Zhang, Hao
%A Franklin, Michael J.
%T NFA-based Filtering for Efficient and Scalable XML Routing
%I EECS Department, University of California, Berkeley
%D 2001
%@ UCB/CSD-01-1159
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2001/6195.html
%F Diao:CSD-01-1159