
Thursday, November 08, 2012




Learning patterns in Big data from small data using core-sets

Seminar | November 8 | 12-1 p.m. | 254 Sutardja Dai Hall


Dan Feldman, Massachusetts Institute of Technology

Institute of Transportation Studies


When we need to solve an optimization problem, we usually use the best available algorithm or software, or try to improve it. In recent years we have started exploring a different approach: instead of improving the algorithm, reduce the input data and run the existing algorithm on the reduced data. This yields the desired output much faster, on streaming input, using a manageable amount of memory, and in parallel (say, using Hadoop, a cloud service, or GPUs).

A core-set for a given problem is a semantic compression of its input, in the sense that a solution for the problem with the (small) core-set as input yields an approximate solution to the problem with the original (Big) data. In this talk I will describe the core-set approach and recent algorithmic achievements for computing core-sets with performance guarantees. I will also describe applications of this magical new paradigm in Machine Learning, Robotics, Computer Vision, and Privacy. Finally, I will describe in detail iDiary: a system that turns large sensor signals collected from smartphones into textual descriptions of the trajectories. The system features a user interface similar to Google Search that allows users to type text queries about their activities (e.g., “Where did I buy books?”) and receive textual answers based on their GPS signals.
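
To make the core-set definition above concrete, here is a minimal sketch in Python. It is not the construction presented in the talk: it assumes a deliberately simple toy problem (the 1-D 1-median, i.e. the center minimizing the sum of absolute distances) and a textbook grid-snapping reduction. The names `grid_coreset`, `cost`, `weighted_median`, and `CELL` are hypothetical and chosen for illustration only. Each point moves by at most `CELL / 2` when snapped, so the center computed on the small weighted set costs at most `n * CELL` more on the original data than the true optimum.

```python
import math
import random
from collections import Counter

CELL = 0.5  # grid cell width (assumed); smaller cells give a tighter approximation


def grid_coreset(points, cell):
    """Snap each 1-D point to the center of its grid cell.

    Returns a dict {representative: weight}, where weight is the number of
    original points that fell into that cell.
    """
    counts = Counter(math.floor(p / cell) for p in points)
    return {(i + 0.5) * cell: w for i, w in counts.items()}


def cost(weighted, c):
    """Weighted sum of absolute distances from the points to center c."""
    return sum(w * abs(p - c) for p, w in weighted.items())


def weighted_median(weighted):
    """Return a minimizer of cost(weighted, c): the lower weighted median."""
    total = sum(weighted.values())
    acc = 0
    for p in sorted(weighted):
        acc += weighted[p]
        if 2 * acc >= total:
            return p


# Synthetic "big" input (an assumption for the demo, not data from the talk).
random.seed(0)
points = [random.gauss(50.0, 10.0) for _ in range(20000)]
original = Counter(points)  # the full data as a weighted set, each weight 1

coreset = grid_coreset(points, CELL)

c_small = weighted_median(coreset)   # solve on the small core-set
c_full = weighted_median(original)   # solve on the full data, for comparison

print("original size:", len(points), "core-set size:", len(coreset))
print("cost of core-set solution on full data:", cost(original, c_small))
print("cost of optimal solution on full data: ", cost(original, c_full))
```

Because the cell counts of two chunks can simply be added, this particular reduction can also be computed chunk by chunk, which loosely mirrors the streaming and parallel use mentioned in the abstract; the constructions with performance guarantees described in the talk are more sophisticated than this sketch.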


a.vij@berkeley.edu