Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

   

Research Areas - Database Management Systems (DBMS)

Overview

Computers are used less for computing per se than for the management, distribution, and analysis of information. Database research thus goes to the heart of computing. Berkeley's database group defined the field in the 1970s through pioneering research projects, including INGRES and POSTGRES, that spawned a multibillion-dollar relational database industry and a series of influential open-source systems. Today, the group continues to redefine the field, taking the foundations of data management into a global environment of live, noisy, networked data sources.

Database group web site: db.cs.berkeley.edu

Topics

  • Declarative Networking

    Database systems have long used "declarative" languages, in which programmers focus on program outcomes (what) rather than implementation (how). In recent years, our group has demonstrated that recursive declarative languages and runtime engines are an excellent match for building distributed and networked systems. Our declarative networking approach provides radically simplified, efficient implementations of tasks as diverse as distributed query processing, statistical inference, distributed agreement, and core networking protocols. We have demonstrated that declarative programs a few dozen lines long are competitive with C++ implementations that are tens of thousands of lines long. Our software includes the P2 system for declarative overlay networks on the Internet, and the DSN system for declarative programming of wireless sensor networks.

  • Data Management for Wireless Sensor Networks and RFID

    Berkeley is the multidisciplinary leader in wireless sensor network research. Sensor networks, and related technologies like RFID infrastructures, are by their nature tools for data acquisition and management, and Berkeley's database group has played a key role in this space. We developed the TinyDB sensornet query engine, the first system to provide a high-level language and runtime for tasking entire "clouds" of sensors in a simple way. We designed probabilistic methods for energy-efficient approximation of sensornet queries and distributed triggers, as well as statistical methods to clean noisy data coming from unpredictable RFID readers. The Declarative Sensor Network (DSN) project described above investigates the use of deductive database techniques to program entire sensornet "stacks", from core networking internals to high-level data management.

  • Probabilistic Data Management

    Several real-world applications need to effectively manage large amounts of data that are inherently uncertain, employing sophisticated probabilistic modeling tools to accurately reason about complex correlation/causality patterns in the data. Example applications include sensor-rich, "smart-home" environments and bioinformatics databases, where noisy, uncertain data is the norm and probabilistic models are used, e.g., to infer user activities or reason about protein molecule structures. We are working to redefine the algorithms and architecture of a DBMS to effectively manage uncertainty and probabilistic reasoning as "first-class citizens" of the system. This includes novel techniques for (a) exposing statistical modeling structures and inference algorithms to key DBMS components (e.g., query engine, query optimizer), and (b) supporting a uniform, declarative means for higher-level applications to store, query, and learn from such probabilistic data.

  • Stream Query Processing

    Traditional data management has assumed a stored repository of information. Recent years have seen a proliferation of streaming data sources, including sensor networks, financial data feeds, and monitors of networks and software services. Stream data management raises a number of new challenges in adaptively processing multiple queries, managing fault tolerance, dealing with archives, and providing approximate answers in overload situations. Berkeley's database group has been a leader in this area, investigating these issues and more in the context of the Telegraph project for adaptive processing of stream queries, and in the YFilter XML message broker.
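
As a rough illustration of the recursive declarative style described under Declarative Networking, the sketch below computes network reachability from two Datalog-style rules: path(X, Y) holds if link(X, Y) holds, and path(X, Z) holds if link(X, Y) and path(Y, Z) both hold. The rules only state what a path is; a small generic loop applies them until nothing new can be derived. This is a minimal sketch under stated assumptions, not the language or runtime of P2 or DSN: the function name reachable_pairs, the naive fixpoint loop, and the sample links are all hypothetical and chosen only for illustration.

```python
def reachable_pairs(links):
    """Naive fixpoint evaluation of two Datalog-style rules (illustrative only).

    Rule 1: path(X, Y) :- link(X, Y).
    Rule 2: path(X, Z) :- link(X, Y), path(Y, Z).
    """
    # Rule 1: every direct link is a path.
    paths = set(links)
    while True:
        # Rule 2: join link(X, Y) with path(Y, Z) on the shared node Y.
        derived = {
            (x, z)
            for (x, y) in links
            for (y2, z) in paths
            if y == y2
        }
        new_facts = derived - paths
        if not new_facts:
            # Fixpoint reached: no rule can derive anything new.
            return paths
        paths |= new_facts


# Hypothetical four-node chain: a -> b -> c -> d.
links = {("a", "b"), ("b", "c"), ("c", "d")}
paths = reachable_pairs(links)
print(sorted(paths))
```

The caller never says how to traverse the graph; the traversal order falls out of repeatedly applying the rules, which is the "what rather than how" distinction the topic description draws.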
