Currently Teaching

AI-Sys LLM Edition Graduate Seminar (294-162) FA23.

Course Website

Description

The recent success of AI has been at least partly driven by advances in hardware and software systems. These systems have enabled training increasingly complex models on ever larger datasets. In the process, these systems have also simplified model development, enabling the rapid growth in the machine learning community. These new hardware and software systems include a new generation of GPUs and hardware accelerators (e.g., TPU) as well as open source frameworks such as TensorFlow and PyTorch (and many others) and have shaped AI research and practice.

A fundamental hypothesis in AI-Systems research is that advances in systems will enable the continued scaling of models and data to unlock new AI capabilities. In the past few years, we have started to see significant evidence for this hypothesis, with very large models unlocking new AI capabilities, especially in the context of text and image generation.

In this course, we will study the latest trends in systems design to better support the next generation of AI applications. We will focus on advances in generative AI, specifically LLMs, and how they have mirrored advances in computer systems for AI. We will cover the key pieces of work from the systems and AI literature that have driven these advances and may reveal where research is headed next.


Previous Classes

Data 8: Foundations of Data Science

Course Website

Description

The UC Berkeley Foundations of Data Science course combines three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. It delves into social issues surrounding data analysis such as privacy and design.

AI-Sys Graduate Seminar (294-162) SP22.

Course Website

Description

The recent success of AI has been due in large part to advances in hardware and software systems. These systems have enabled training increasingly complex models on ever larger datasets. In the process, these systems have also simplified model development, enabling the rapid growth in the machine learning community. These new hardware and software systems include a new generation of GPUs and hardware accelerators (e.g., TPU and Nervana), open source frameworks such as Theano, TensorFlow, PyTorch, MXNet, Apache Spark, Clipper, Horovod, and Ray, and a myriad of systems deployed internally at companies, just to name a few. At the same time, we are witnessing a flurry of ML/RL applications to improve hardware and system designs, job scheduling, program synthesis, and circuit layouts.

In this course, we will describe the latest trends in systems design to better support the next generation of AI applications, as well as applications of AI to optimize the architecture and performance of systems. The format of this course will be a mix of lectures, in-class paper review discussions, and student presentations. Students will be responsible for paper readings and for completing a hands-on project. For projects, we will strongly encourage teams that contain both AI and systems students.

Conversations with Thought Leaders in Technology (CS198-100).

This interactive seminar class will connect a select group of Berkeley students with thought leaders at major technology companies and venture capital firms. Each week will be organized around a short set of readings related to the speaker’s background, followed by a live Zoom session with that speaker. Students will participate in an interactive Slack discussion and generate questions to ask each speaker. Students will learn about how new technology is funded and developed as well as trends in the industry. They will also have the opportunity to meet face-to-face with CXOs, VC partners, and top research scientists from around the world and learn about their career paths. This is also an amazing opportunity to network with leaders in the technology world.

Data Science 100

I co-created this large intermediate data science class. This class now serves over 1200 students a semester.

Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. This intermediate-level class bridges between Data 8 and upper-division computer science and statistics courses as well as methods courses in other fields. In this class, we explore key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making. Through a strong emphasis on data-centric computing, quantitative critical thinking, and exploratory data analysis, this class covers key principles and techniques of data science. These include languages for transforming, querying, and analyzing data; algorithms for machine learning methods including regression, classification, and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.

  1. Data Science 100 (SP20)
  2. Data Science 100 (SP18)
  3. Data Science 100 (FA17)
  4. Data Science 100 (SP17)

AI-Sys Graduate Seminar (294-162) FA 2019.

Course Website

Description

The recent success of AI has been due in large part to advances in hardware and software systems. These systems have enabled training increasingly complex models on ever larger datasets. In the process, these systems have also simplified model development, enabling the rapid growth in the machine learning community. These new hardware and software systems include a new generation of GPUs and hardware accelerators (e.g., TPU and Nervana), open source frameworks such as Theano, TensorFlow, PyTorch, MXNet, Apache Spark, Clipper, Horovod, and Ray, and a myriad of systems deployed internally at companies, just to name a few. At the same time, we are witnessing a flurry of ML/RL applications to improve hardware and system designs, job scheduling, program synthesis, and circuit layouts.

In this course, we will describe the latest trends in systems design to better support the next generation of AI applications, as well as applications of AI to optimize the architecture and performance of systems. The format of this course will be a mix of lectures, in-class paper review discussions, and student presentations. Students will be responsible for paper readings and for completing a hands-on project. For projects, we will strongly encourage teams that contain both AI and systems students.


Previous Graduate Seminars

When time is available, I teach graduate seminars in machine learning and systems.

AI-Sys Graduate Seminar (294-159) Spring 2019.

This was the first of the AI-Sys graduate seminars co-taught with Ion Stoica. Here we focused on interesting recent papers in AI and Systems.

CS294-20: RISE Lab Class

This seminar aims to serve as a catalyst for research in the RISE lab, a new lab following the AMPLab. We will read and discuss papers on the state of the art in learning systems (large-scale model training, deep learning, real-time robust inference), big data systems (scale-out vs. scale-up, scalable data analytics), and systems security (computation on encrypted data, secure hardware enclaves, language-based mechanisms).


Introduction to Databases

Access methods and file systems to facilitate data access. Hierarchical, network, relational, and object-oriented data models. Query languages for models. Embedding query languages in programming languages. Database services including protection, integrity control, and alternative views of data. High-level interfaces including application generators, browsers, and report writers. Introduction to transaction processing. Database system implementation to be done as term project.

  1. Introduction to Databases (CS186/286) co-taught with Joe Hellerstein.