Bin Yu — Three principles of data science: predictability, stability, and computability

Seminar | August 28 | 3:30-5 p.m. | 3108 Etcheverry Hall

 Bin Yu, Professor, UC Berkeley Departments of Statistics and EECS

 Industrial Engineering & Operations Research

In this talk, I'd like to discuss the intertwined importance of, and connections among, the three principles of data science named in the title, in the context of data-driven decisions. With prediction as its central task and computation at its core, machine learning has enabled wide-ranging data-driven successes. Prediction is a useful way to check against reality. Good prediction implicitly assumes stability between past and future. Stability (relative to data and model perturbations) is also a minimum requirement for interpretability and reproducibility of data-driven results (cf. Yu, 2013). It is closely related to uncertainty assessment. Obviously, neither the prediction principle nor the stability principle can be employed without feasible computational algorithms, hence the importance of computability.

The three principles will be demonstrated in the context of two neuroscience projects and through analytical connections. In particular, the first project adds stability to predictive modeling used for reconstruction of movies from fMRI brain signals, in order to gain interpretability of the predictive model. The second project uses predictive transfer learning that combines AlexNet, GoogLeNet and VGG with single V4 neuron data for state-of-the-art prediction performance. It provides stable functional characterization of neurons in the difficult primate visual cortex area V4, via (manifold) deep dream images from the predictive models. Our V4 results lend support, to a certain extent, to the resemblance of these CNNs to a primate brain.

Bin Yu
Departments of Statistics and EECS, UC Berkeley
statistics.berkeley.edu/~binyu

 510-642-6222