Instructor: Ben Recht
Time: TuTh 3:30-5:00 PM
Location: 521 Cory Hall / 3 LeConte Hall
Office Hours: T 2:30-3:30, M 3-4.
Location: 726 Sutardja Dai Hall
Description:
This course will explore theory and algorithms for nonlinear
optimization. We will focus on problems that arise in machine
learning and computational statistics, paying close attention to
concerns about complexity, scaling, and implementation in these
domains. Whenever possible, methods will be linked to particular
application examples in data analysis. Topics will include gradient and accelerated gradient methods, Newton and quasi-Newton methods, stochastic gradient and subgradient methods, projected gradient and proximal methods, duality, mirror descent, dual averaging, and the alternating direction method of multipliers.
Required background: The prerequisites are previous coursework in linear algebra, multivariate calculus, probability, and statistics. Some degree of mathematical maturity is also required. Coursework or background in optimization theory as covered in EE227BT is highly recommended. Numerical programming will be required for this course, so familiarity with MATLAB, R, numerical Python, or an equivalent will be necessary.
Grading: There will be about four homeworks, which require some basic programming. There will be a take-home midterm and no final. A course project will also be required.
Texts:
Nocedal and Wright, Numerical Optimization (NW).
Nesterov, Introductory Lectures on Convex Optimization: A Basic Course (Nest.).
Recommended references:
Lecture 1 (1/21): Introduction and math review. notes, additional reading: NW Chap. 2
Lecture 2 (1/23): The Gradient Method. notes, additional reading: NW Chap. 3, Nest. Chap. 1.2.3.
Lecture 3 (1/28): More on the Gradient Method. notes
Lecture 4 (1/30): Quick review of convexity. notes
Lecture 5 (2/4): The gradient method and convex functions. notes, additional reading: Nest. Chap. 2.1.1, 2.1.5.
Lecture 6 (2/6): Lower bounds for first order methods. notes, additional reading: Nest. Chap. 2.1.2, 2.1.4.
Lecture 7 (2/13): Momentum and the Heavy Ball Method. notes
Lecture 8 (2/18): Nesterov's accelerated method. notes, additional reading: Nest. 2.2.
Lecture 9 (2/20): Newton and quasi-Newton methods. notes, additional reading: Nest. 1.2.4, NW Chap 3.3, Chap 6.
Lecture 10 (2/27): Quasi-Newton loose ends: weak Wolfe line search, L-BFGS, and Barzilai-Borwein, additional reading: NW Chap. 7.2.
Lecture 11 (3/4): Hardness of non-convex optimization.
Lecture 12 (3/6): The stochastic gradient method.
Lecture 13 (3/11): Analysis of the stochastic gradient method.
Lecture 14 (3/13): Subgradients.
Lecture 15 (3/18): The subgradient method.
Lecture 16 (3/20): The projected gradient and proximal point methods.
Lecture 17 (4/1): Duality.
Lecture 18 (4/8): Dual decomposition and the augmented Lagrangian.
Lecture 19 (4/10): Mirror Descent.
Lecture 20 (4/15): Dual averaging.
Lecture 21 (4/17): The alternating direction method of multipliers.
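To give a flavor of the numerical programming the course requires, here is a minimal sketch of the gradient method from Lectures 2-5, applied to a least-squares problem. This is an illustrative example only, not course-provided code; the problem data, step size, and iteration count are assumptions.

```python
import numpy as np

def gradient_method(A, b, iters=500):
    """Minimize f(x) = 0.5 * ||Ax - b||^2 by gradient descent with the
    fixed step size 1/L, where L = ||A||_2^2 is the Lipschitz constant
    of the gradient."""
    L = np.linalg.norm(A, 2) ** 2      # largest singular value, squared
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)       # gradient of the least-squares objective
        x -= grad / L                  # constant step size 1/L
    return x

# Random instance (assumed for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)

x = gradient_method(A, b)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # direct least-squares solution
print(np.linalg.norm(x - x_star))
```

Since the objective here is strongly convex, the iterates converge linearly to the least-squares solution, so after a few hundred iterations the two answers agree to high accuracy.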
Homeworks
Problem Set 1. Due in class on February 13.
Problem Set 2. Due in class on March 4.
Problem Set 3. Adult data set: [mat] [csv]. Due in class on March 20.
Miscellaneous Readings: