EECS Joint Colloquium Distinguished Lecture Series

Wednesday, October 30, 2002
Hewlett Packard Auditorium, 306 Soda Hall
4:00-5:00 p.m.

Professor Jennifer Widom

Departments of Computer Science and Electrical Engineering, Stanford University


Old Systems for New Data: Querying XML and Data Streams




Over the past 30 years, database researchers and practitioners have settled into a fairly well-defined and accepted architecture and suite of techniques for database management, which rely on a number of basic assumptions about database applications and about the data itself. Dropping any of these assumptions renders many standard techniques inapplicable or very inefficient. In recent work, we have explored dropping two assumptions, one at a time.

First, we dropped the assumption that each database has a fixed, well-structured schema (type structure) defined in advance. Dropping this assumption led to the study of "semistructured data," which predated but bears a remarkable resemblance to XML. We developed a prototype database management system for semistructured data (and later XML). I will give a brief overview of the system, then focus on how our query processor coped with the absence of a schema and the lack of data regularity.

In current work we are dropping the assumption that data comes as relatively static, bounded-size data sets. Instead we consider data in the form of continuous, unbounded, possibly rapid "data streams." Once again we are developing a prototype database management system, this time for data streams. I will discuss two of our research challenges: languages for continuous queries over data streams, and exploiting data constraints to minimize resource overhead while maximizing query result precision.


Jennifer Widom received her Bachelor's degree from the Indiana University School of Music in 1982 and her Computer Science Ph.D. from Cornell University in 1987. From 1987 to 1988 she was a Visiting Assistant Professor in the Computer Science Department at Cornell. Before joining the Stanford faculty in 1993, she spent five years as a Research Staff Member at the IBM Almaden Research Center. She has coauthored three books, is a former Guggenheim Fellow, and has served on various program committees, advisory boards, and editorial boards.