Applied Mathematics Seminar: Uncertainty Quantification in the Classification of High-Dimensional Data

Seminar | April 26 | 3:30-4:30 p.m. | 891 Evans Hall

 Andrew Stuart, California Institute of Technology

 Department of Mathematics

We provide a unified framework for graph-based semi-supervised learning that brings together a variety of methods introduced in different communities within the mathematical sciences; the unification is through an inverse problems formulation. We study probit classification, generalize the level-set method for Bayesian inverse problems to the classification setting, and generalize the Ginzburg-Landau optimization-based classifier to a Bayesian setting. We also show that the probit and level-set approaches are natural relaxations of the harmonic function approach introduced in machine learning.
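For readers unfamiliar with the harmonic function approach mentioned above, the following is a minimal sketch of one common formulation, not the speaker's implementation: label values are held fixed at the labeled nodes, and every unlabeled node is repeatedly replaced by the weighted average of its neighbors until the values stop changing. The function name `harmonic_labels`, the six-node path graph, and the tolerance are illustrative assumptions.

```python
def harmonic_labels(weights, labels, sweeps=2000, tol=1e-12):
    """Gauss-Seidel iteration for a harmonic function on a weighted graph.

    weights: symmetric n x n list of lists with zero diagonal (assumed).
    labels:  dict mapping labeled node index -> +1.0 or -1.0 (assumed encoding).
    """
    n = len(weights)
    # Labeled nodes start at (and keep) their label; unlabeled nodes start at 0.
    f = [float(labels.get(i, 0.0)) for i in range(n)]
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(n):
            if i in labels:
                continue  # labeled values are never updated
            degree = sum(weights[i])
            if degree == 0.0:
                continue  # isolated node: nothing to average over
            new_value = sum(weights[i][j] * f[j] for j in range(n)) / degree
            max_change = max(max_change, abs(new_value - f[i]))
            f[i] = new_value
        if max_change < tol:
            break
    return f


# Hypothetical example: a path graph 0-1-2-3-4-5 with unit edge weights,
# node 0 labeled +1 and node 5 labeled -1.
n = 6
weights = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
labels = {0: 1.0, 5: -1.0}
f = harmonic_labels(weights, labels)
predicted = [1 if value > 0 else -1 for value in f]
```

On this path graph the harmonic function interpolates linearly between the two labeled endpoints, so the sign of each value assigns every unlabeled node to the class of its nearer labeled endpoint.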

We introduce efficient numerical methods, suited to large datasets, for both MCMC-based sampling and gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semi-supervised learning algorithms. Finally, we study continuum limits of the problem formulations, and of the algorithms, arising in the infinite-data limit.
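To make the idea of gradient-based MAP estimation concrete, here is a small sketch under stated assumptions, not the method presented in the talk. It assumes a probit likelihood with noise scale `gamma`, a Gaussian prior whose precision matrix is the graph Laplacian plus a small shift `tau**2` times the identity, and plain fixed-step gradient descent. The function name `probit_map`, all parameter values, and the path graph are hypothetical.

```python
import math


def probit_map(weights, labels, gamma=1.0, tau=0.1, step=0.1, iters=5000):
    """Gradient descent on an assumed negative log-posterior

        J(u) = 0.5 * u^T (L + tau^2 I) u - sum_j log Phi(y_j * u_j / gamma),

    where L = D - W is the graph Laplacian and Phi is the standard normal CDF.
    weights: symmetric n x n list of lists with zero diagonal (assumed).
    labels:  dict mapping labeled node index -> +1.0 or -1.0 (assumed encoding).
    """
    n = len(weights)
    u = [0.0] * n
    for _ in range(iters):
        grad = []
        for i in range(n):
            degree = sum(weights[i])
            # Prior term: row i of (L + tau^2 I) u.
            g = (degree + tau ** 2) * u[i] - sum(
                weights[i][j] * u[j] for j in range(n)
            )
            if i in labels:
                # Likelihood term: derivative of -log Phi(y * u_i / gamma).
                z = labels[i] * u[i] / gamma
                pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
                cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
                g -= labels[i] * pdf / (gamma * cdf)
            grad.append(g)
        u = [ui - step * gi for ui, gi in zip(u, grad)]
    return u


# Hypothetical example: path graph 0-1-2-3-4-5, endpoints labeled +1 and -1.
n = 6
weights = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
labels = {0: 1.0, 5: -1.0}
u_map = probit_map(weights, labels)
map_classes = [1 if value > 0 else -1 for value in u_map]
```

Unlike the hard-constrained harmonic interpolation, the labeled nodes here are only softly pulled toward their labels through the likelihood, which is one way to read the abstract's description of probit as a relaxation of the harmonic function approach.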

Collaboration with AL Bertozzi, X Luo (UCLA), and KC Zygalakis (Edinburgh).