Dissertation Talk: Local and Adaptive Image-to-Image Learning and Inference

Seminar: Dissertation Talk: CS | May 15 | 1:30-2:30 p.m. | Sutardja Dai Hall, Newton Room/730

 Evan Shelhamer, UC Berkeley

 Electrical Engineering and Computer Sciences (EECS)

Much of the recent progress on visual recognition has been driven by deep learning and its bicameral heart of composition and end-to-end optimization. Its diffusion, however, was neither instantaneous nor effortless. To advance across the frontiers of vision, deep learning had to be equipped with the right structures: the true, intrinsic structures of the visual world.

In this talk, I will focus on incorporating locality and scale structure into end-to-end learning to address image-to-image tasks that take image inputs and return image outputs, and examine how dynamic inference, which adapts model computation, can help cope with the variability of these rich prediction problems. I will look at these directions through the lens of local recognition tasks that require inference of what and where.

Fully convolutional networks decompose image-to-image learning and inference into local scopes. Factorizing these scopes into structured Gaussian and free-form parts, and learning both, optimizes their size and shape to control the degree of locality. Dynamic inference equips our fully convolutional networks with adaptivity to more fully engage with the vastness and variety of vision.
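As a rough illustration of the factorized-scope idea, here is a minimal sketch (assuming PyTorch; the module name and details are hypothetical, not the talk's exact formulation): a free-form convolution composed with a structured Gaussian blur whose scale is a learned parameter, so the size of the local scope is optimized end to end along with the free-form weights.

```python
# Minimal sketch, assuming PyTorch: compose a structured Gaussian part
# (learned scale sigma) with a free-form convolution, so the effective
# receptive field size is learned end to end. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianComposedConv(nn.Module):
    def __init__(self, in_ch, out_ch, free_kernel=3, blur_kernel=9):
        super().__init__()
        # free-form part: an ordinary learned convolution
        self.free = nn.Conv2d(in_ch, out_ch, free_kernel, padding=free_kernel // 2)
        # structured part: Gaussian scale, learned in log space for positivity
        self.log_sigma = nn.Parameter(torch.zeros(1))
        self.blur_kernel = blur_kernel

    def forward(self, x):
        k = self.blur_kernel
        sigma = self.log_sigma.exp()
        coords = torch.arange(k, dtype=x.dtype, device=x.device) - (k - 1) / 2
        g1d = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
        g1d = g1d / g1d.sum()
        g2d = torch.outer(g1d, g1d)                # separable Gaussian kernel
        weight = g2d.expand(x.shape[1], 1, k, k)   # one blur per input channel
        x = F.conv2d(x, weight, padding=k // 2, groups=x.shape[1])  # structured blur
        return self.free(x)                        # free-form convolution
```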

Advisor: Trevor Darrell


 shelhamer@cs.berkeley.edu