Computer Vision Beyond Recognition
Seminar | February 6 | 12-1:30 p.m. | 560 Evans Hall
Stella Yu, UC Berkeley
Computer vision has advanced rapidly with deep learning, achieving super-human performance on a few recognition benchmarks. At the core of the state-of-the-art approaches for image classification, object detection, and semantic/instance segmentation is sliding-window classification, engineered for computational efficiency. Such piecemeal analysis of visual perception often has trouble getting details right and fails miserably with occlusion. Human vision, on the other hand, thrives on occlusion, excels at seeing wholes and parts, and can recognize objects with very little supervision. I will describe several works that build upon concepts of perceptual organization, integrate multiscale and figure-ground cues, and learn pixel and image relationships in a data-driven fashion, with no annotations at all or with fewer and weaker annotations, in order to deliver more accurate and more generalizable performance beyond recognition in a closed world. Our recent work can not only capture apparent visual similarity without perceptual organization priors or any feature engineering, but also provide powerful exploratory data analysis tools that can seamlessly integrate external domain knowledge into a data-driven machine learning framework.
Stella Yu received her Ph.D. from Carnegie Mellon University, where she studied robotics at the Robotics Institute and vision science at the Center for the Neural Basis of Cognition. She continued her computer vision research as a postdoctoral fellow at UC Berkeley, and then studied art and vision as a Clare Boothe Luce Professor at Boston College, during which time she received an NSF CAREER award. Dr. Yu is currently the Director of the Vision Group at the International Computer Science Institute (ICSI) and a Senior Fellow at the Berkeley Institute for Data Science (BIDS) at UC Berkeley. Dr. Yu is interested not only in understanding visual perception from multiple perspectives, but also in using computer vision and machine learning to capture and exceed human expertise in practical applications.