Dissertation talk: Machine Learning---Why Do Simple Algorithms Work So Well?

Lecture: Dissertation Talk: CS | May 8 | 4-5 p.m. | 540AB Cory Hall

 Chi Jin

 Electrical Engineering and Computer Sciences (EECS)

While state-of-the-art machine learning models are deep, large-scale, sequential, and highly nonconvex, the backbone of modern learning is a handful of simple algorithms such as stochastic gradient descent or, for reinforcement learning tasks, Q-learning. A basic question endures: why do simple algorithms work so well even in these challenging settings?

This talk focuses on two fundamental problems: (1) in nonconvex optimization, can gradient descent escape saddle points efficiently? (2) in reinforcement learning, is Q-learning sample efficient? We will provide the first line of provably positive answers to both questions. In particular, we will show that simple modifications to these classical algorithms guarantee significantly better properties, which helps explain their favorable performance in practice.
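To give a concrete flavor of such a modification, the sketch below (not taken from the talk) shows a perturbed variant of gradient descent: whenever the gradient is nearly zero, a small random step is injected so the iterate can leave a strict saddle point. The toy objective, step size, perturbation radius, and function names are illustrative assumptions, not details of the dissertation.

    # Illustrative sketch: perturbed gradient descent on a toy nonconvex
    # function with a strict saddle at the origin and minima at (+-1, 0).
    # Constants (eta, grad_tol, radius, steps) are toy choices.
    import numpy as np

    def f(z):
        x, y = z
        return 0.25 * x**4 - 0.5 * x**2 + 0.5 * y**2

    def grad_f(z):
        x, y = z
        return np.array([x**3 - x, y])

    def perturbed_gd(z0, eta=0.1, grad_tol=1e-3, radius=1e-2, steps=500, seed=0):
        rng = np.random.default_rng(seed)
        z = np.array(z0, dtype=float)
        for _ in range(steps):
            g = grad_f(z)
            if np.linalg.norm(g) <= grad_tol:
                # Near a first-order stationary point: take a small random
                # step (a random direction of length `radius`) to escape a
                # possible saddle point.
                d = rng.normal(size=z.shape)
                z = z + radius * d / np.linalg.norm(d)
            else:
                # Ordinary gradient descent step otherwise.
                z = z - eta * g
        return z

    # Plain gradient descent started exactly at the saddle (0, 0) never moves;
    # the perturbed variant drifts toward one of the minima at (+-1, 0).
    print(perturbed_gd([0.0, 0.0]))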

 chijin@berkeley.edu, 510-387-1599