Dissertation Talk: Systems-Aware Optimization for Machine Learning at Scale

Seminar | May 4 | 11 a.m.-12 p.m. | 405 Soda Hall

Virginia Smith, EECS

Electrical Engineering and Computer Sciences (EECS)

New computing systems have emerged in response to the increasing size and complexity of modern datasets. For best performance, machine learning methods must be designed to closely align with the underlying properties of these systems. In this talk, I illustrate the impact of systems-aware machine learning in the distributed setting, where communication remains the most significant bottleneck. I present a general optimization framework, CoCoA, that uses local computation in a primal-dual setting to allow for a tunable, problem-specific communication scheme. The resulting framework enjoys strong convergence guarantees and exhibits state-of-the-art empirical performance. I demonstrate this performance with extensive experiments in Apache Spark, achieving speedups of up to 50x over leading distributed methods for common machine learning objectives.