Seminar | November 6 | 3-4 p.m. | 400 Cory Hall
Yuejie Chi, CMU
There is an increasing need to perform large-scale machine learning and optimization over distributed networks, e.g., in the context of multi-agent learning and federated optimization. It is well recognized that a careful balance of local computation and global communication is necessary in the distributed setting for empirical risk minimization. In this talk, we first consider a natural framework for distributing popular stochastic variance-reduced methods in the master/slave setting, and establish its convergence guarantees under simple and intuitive assumptions that capture the effect of local data heterogeneity. Local regularization will be introduced to ensure convergence when data are highly unbalanced. Next, we move to the decentralized network setting, where each agent only aggregates information from its neighbors over a network topology. We first develop a communication-efficient approximate Newton-type algorithm, obtained by adjusting the global gradient estimate via a tracking term for the popular DANE algorithm (Shamir et al., 2014). The same idea can be applied in a systematic manner to obtain decentralized versions of other master/slave distributed algorithms, such as those employing variance reduction. Theoretical convergence guarantees and numerical evidence are provided to demonstrate the appealing performance of our algorithms over competitive baselines, in terms of both communication and computation efficiency.
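To give a flavor of the gradient-tracking idea mentioned in the abstract, the sketch below is a minimal, illustrative example (not the speaker's exact algorithm): decentralized agents on a ring minimize the average of local quadratics f_i(x) = (x - b_i)^2 / 2 by mixing iterates with neighbors while a tracking variable y estimates the global gradient. The mixing matrix, step size, and local objectives are all assumptions chosen for simplicity.

```python
import numpy as np

# Four agents on a ring; each agent i holds local data b[i] and the
# local gradient of f_i(x) = 0.5 * (x - b[i])^2 is simply x - b[i].
b = np.array([1.0, 2.0, 3.0, 4.0])

# Doubly stochastic, symmetric mixing matrix for the ring topology
# (each agent averages itself with its two neighbors).
W = np.array([
    [1/3, 1/3, 0.0, 1/3],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 1/3, 1/3, 1/3],
    [1/3, 0.0, 1/3, 1/3],
])

alpha = 0.1        # step size (assumed small enough for convergence)
x = np.zeros(4)    # each agent's local copy of the decision variable
g = x - b          # local gradients at the current iterates
y = g.copy()       # tracking variable: running estimate of the global gradient

for _ in range(2000):
    # Mix with neighbors, then step along the tracked global gradient.
    x_new = W @ x - alpha * y
    g_new = x_new - b
    # Update the tracker with the change in local gradients.
    y = W @ y + g_new - g
    x, g = x_new, g_new

# All agents converge to the global minimizer, the mean of b (= 2.5),
# even though no agent ever sees the others' data directly.
print(x)
```

The tracking update keeps the average of y equal to the average of the local gradients at every iteration, which is what lets purely local communication drive all agents to the global optimum rather than to their individual local minimizers.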