Matrix Computations and Scientific Computing Seminar: Sparse inverse covariance matrix estimation for a million variables

Seminar | March 8 | 11 a.m.-12 p.m. | 380 Soda Hall

 Inderjit Dhillon, University of Texas at Austin

 Department of Mathematics

The L1-regularized Gaussian maximum likelihood estimator has been shown to have strong statistical guarantees in recovering a sparse inverse covariance matrix even under high-dimensional settings. However, it requires solving a difficult non-smooth log-determinant program with number of parameters that scale quadratically with the number of Gaussian variables. Earlier methods thus do not scale to problems with more than 20,000 variables. In this talk, I will describe a quadratic approximation method that can solve 1-million dimensional L1-regularized log determinant problems (which would thus have a trillion parameters) on a single computer. In order to do so, we carefully exploit the underlying structure of the problem. Our innovations include (i) a second-order Newton-like method, (ii) division of the variables into free and fixed sets, (iii) a block co-ordinate descent method, and (iv) a memory efficient scheme that approximates key quantities within the algorithm. Even with the latter approximations, the proposed BIGQUIC algorithm can achieve a quadratic convergence rate. Experimental results using synthetic and real application data demonstrate the improvements in performance over previous state-of-the-art methods.

This is joint work with Cho-Jui Hsieh, Matyas Sustik and Pradeep Ravikumar.