Dissertation talk: Navigating Videos using Structured Text
Lecture: Dissertation Talk: CS | December 13 | 2-3 p.m. | Hearst Memorial Mining Building, 354/360 HMMB (BiD Lab http://bid.berkeley.edu/directions/)
Amy Pavel, UC Berkeley
Video provides a rich and widespread medium for documentation, education, and entertainment. However, the difficulty of browsing, skimming, and navigating long recorded videos limits their utility for reference and reuse. Recent advances in computer vision and natural language processing allow surfacing objects, actions, scenes, and speech as navigable text. But the text provided by such methods does not necessarily support higher-level human tasks. Instead, the higher-level semantic structure of the video (e.g., the outline, summary, or scenes) often remains hidden from the user.
This talk explores how we can combine domain-specific human annotations with automatic techniques to let people navigate videos using structured text documents. I will explore this idea through systems spanning three domains that support different tasks by aligning video to text documents: educational lecture videos, movies, and critique sessions. The implementation and evaluation of each system provide insights into how we might better facilitate human and machine collaboration.
Advisors: Björn Hartmann and Maneesh Agrawala