CS 186. Introduction to Databases
Catalog Description: (4 units) A hands-on introduction to Database Systems: their internal architecture, algorithms and data structures, mathematical underpinnings, and use. Topics include: (a) Engine Technology for Large Data Sets: disk layout, indexing, search and query processing algorithms, query optimization, transactional concurrency control, logging and recovery; (b) Data Models and Languages: The Relational model of data, formal relational languages (relational algebra and calculus), and the SQL language. Text and web search models and implementation, including Boolean search and ranking. Extensions to the relational model including object-relational features, XML and associated query languages; (c) Database Design: Entity-Relationship modeling, logical relational schema design, physical design, functional dependencies and normalization, and database tuning; (d) Database application development: Application-level database APIs including host-languages embeddings library interfaces like JDBC, and web script interfaces like PHP. Work for the course includes team-oriented programming projects based on extensions to the PostgreSQL open-source database system.
Prerequisites: CS 61B and CS 61C.
- Fundamental understanding of the theory and practice of data representation, query languages, and their interrelationships. Exposure to leading data and query models including the Relational model, document and web search, and XML.
- Practical experience with implementation of disk-oriented system internals, with a focus on the efficient management of disk and memory. The ability to benchmark these implementations and determine their efficacy.
- Theoretical understanding of transactional consistency semantics, and detailed concurrency and recover protocols to implement these semantics.
- Practical experience in the design and implementation of multi-tier data-intensive applications, integrating Internet-based interfaces, application services, and backend databases.
- System Fundamentals
- Buffer management
- Core query processing techniques
- Join and matching algorithms
- Data models and semantics
- The Entity-Relationship Model
- Relational Model
- Documents and vector spaces
- Query languages and semantics
- Relational Algebra and Calculus
- Keyword and ranked search
- Document and Web Search
- Boolean Text Search
- Ranked Search
- Transactions and data consistency
- Theory of Transactions
- Concurrency Control
- Recovery: The ARIES protocol
- Theory and practice of Database design
- Functional Dependencies
- Physical Database Design and Tuning
- Forms of parallelism
- Parallelizing query operators
- Encapsulating parallelism: the Exchange operator
- Implications for indexing, database design and concurrency control