Dissertation Talk: Towards Practical Privacy-Preserving Data Analytics

Presentation: Dissertation Talk: CS: Data Science | December 12 | 2-3 p.m. | 380 Soda Hall

 Noah M. Johnson

 Electrical Engineering and Computer Sciences (EECS)

Organizations are increasingly collecting sensitive information about individuals. Extracting value from this data requires providing analysts with flexible access, typically in the form of databases that support SQL queries. Unfortunately, allowing access to data has been a major cause of privacy breaches.

In this talk I describe empirical and theoretical advances towards addressing this problem using differential privacy, a technique that provides strong privacy guarantees for individuals while allowing general statistical analysis of data. I begin with a study using 8.1 million real-world queries to determine the requirements for practical differential privacy, and identify limitations of previous approaches in light of these requirements. I then propose a novel method for differential privacy that addresses key limitations of previous approaches.

Next I present Chorus, an open-source system that automatically enforces differential privacy for statistical SQL queries. Chorus is the first system for differential privacy that is compatible with real databases, supports queries expressed in standard SQL, and integrates easily into existing data environments. Chorus is currently deployed at a large technology company for internal analytics and GDPR compliance, where it processes more than 10,000 queries per day.