Learning from Censored and Dependent Data
Seminar: EE: CS: Data Science | November 12 | 4-5 p.m. | Banatao Auditorium, 310 Sutardja Dai Hall
Constantinos Daskalakis, Professor, Department of Computer Science, Massachusetts Institute of Technology
Machine Learning is invaluable for extracting insights from large volumes of data. A key assumption enabling many methods, however, is access to training data comprising independent observations from the entire distribution of relevant data. In practice, data is commonly missing due to measurement limitations, legal restrictions, or data collection and sharing practices. Moreover, observations are commonly collected on a network or over a spatial or temporal domain, and may be intricately dependent. Training on data that is censored or dependent is known to produce biased Machine Learning models.
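To make the bias claim concrete, here is a minimal illustrative sketch. It is not taken from the talk. It assumes a synthetic linear model y = 2x + noise and a hypothetical censoring rule that keeps only observations with y > 0, then compares ordinary least squares slopes on the full and censored samples. All names and constants are assumptions chosen for illustration.

```python
import numpy as np


def ols_slope(x, y):
    """Return the least-squares slope of y regressed on x (with intercept)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


# Assumed synthetic setup: true slope 2, standard normal covariate and noise.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

# Hypothetical censoring rule: only samples with y > 0 are ever observed.
observed = y > 0

slope_full = ols_slope(x, y)
slope_censored = ols_slope(x[observed], y[observed])

print(f"slope on full sample:     {slope_full:.3f}")
print(f"slope on censored sample: {slope_censored:.3f}")
```

In this setup the slope fitted on the censored sample is pulled toward zero relative to the true value of 2, even though the fitting procedure itself is unchanged. Naive least squares does not account for the selection, which is the kind of bias the abstract refers to.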
In this talk, we give an overview of recent work on learning from censored and dependent data. We propose a widely applicable learning framework and instantiate it to obtain computationally and statistically efficient methods for linear, logistic and probit regression from censored or dependent samples, in high dimensions. We complement these theoretical findings with experiments showing the practicality of the framework in training Deep Neural Network models on biased data. Our findings are enabled through connections to Statistical Physics, Concentration and Anti-concentration of measure, and properties of Stochastic Gradient Descent, and resolve classical challenges in Statistics and Econometrics.
BIO: Constantinos (a.k.a. Costis) Daskalakis is a Professor of Computer Science at MIT and a member of CSAIL. He works on computation theory and its interface with game theory, economics, probability theory, statistics and machine learning. He holds a Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, and a PhD in Computer Science from UC Berkeley. He has been honored with the ACM Doctoral Dissertation Award, the Kalai Prize from the Game Theory Society, the Sloan Foundation Fellowship, the Microsoft Faculty Fellowship, the SIAM Outstanding Paper Prize, the ACM Grace Murray Hopper Award, the Simons Investigator Award, the Bodossaki Foundation Distinguished Young Scientists Award, and the Nevanlinna Prize from the International Mathematical Union.