Dissertation Talk: Learning Single-view 3D Reconstruction of Objects and Scenes
Miscellaneous | April 6 | 3-4 p.m. | Sutardja Dai Hall, SDH 250
Shubham Tulsiani, UC Berkeley
In this talk, I will discuss the task of inferring 3D structure underlying an image, in particular focusing on two questions - a) how we can plausibly obtain supervisory signal for this task, and b) what forms of representation should we pursue. I will first show that we can leverage image-based supervision to learn single-view 3D prediction, by using geometry as a bridge between the learning systems and the available indirect supervision. We will see that this approach enables learning 3D structure across diverse setups e.g. predicting deformable models or volumetric 3D for objects, or inferring layered-depth images for scenes. I will then advocate the case for inferring interpretable and compositional 3D representations. I will present a method that discovers the coherent compositional structure across objects in a unsupervised manner by attempting to assemble shapes using volumetric primitives, and then demonstrate the advantages of predicting similar factored 3D representations for complex scenes.