Colloquium
Category | Lecture |
Date | Thursday, April 8, 2010 |
Time | 9:00 am |
Location | GS 701 |
Details | Committee members: Kobus Barnard, Alon Efrat, Clayton Morrison, Ian Fasel |
Speaker | Joseph Schlecht |
Title | PhD Final Defense |
Affiliation | Computer Science Department |
Learning 3-D Models of Object Structure From Images
Abstract: Recognizing objects in images is an effortless task for most people. Automating this task with computers, however, presents a difficult challenge attributable to large variations in object appearance, shape, and pose. The problem is further compounded by ambiguity from projecting 3-D objects into a 2-D image. In this thesis we present an approach to resolve these issues by modeling object structure with a collection of connected 3-D geometric primitives and a separate model for the camera. From sets of images we simultaneously learn a generative, statistical model for the object representation and parameters of the imaging system. By learning 3-D structure models we are going beyond recognition towards quantifying object shape and understanding its variation.
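The projection ambiguity mentioned above can be illustrated with a small sketch. It assumes an idealized pinhole camera with a hypothetical `focal_length` parameter, which is a simplification and not necessarily the camera model used in the thesis. Two different 3-D points on the same ray through the camera center land on the same 2-D image point, so depth cannot be recovered from that point alone.

```python
def project(point, focal_length=1.0):
    """Project a 3-D point (x, y, z) onto the image plane of an ideal pinhole camera.

    The function name and the focal_length parameter are illustrative
    assumptions, not taken from the thesis.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    # Perspective division: image coordinates scale inversely with depth.
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same viewing ray as `near`, at twice the depth

# Both points map to the same image location, so the image alone
# does not tell us which of the two 3-D points produced it.
print(project(near), project(far))
```

Resolving this kind of ambiguity is one motivation for fitting an explicit 3-D structure model together with camera parameters, as the abstract describes.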
We explore our approach in the context of microscopic images of biological structure and single-view images of man-made objects composed of block-like parts, such as furniture. We express detected features from both domains as statistically generated by an image likelihood conditioned on models for the object structure and imaging system. Our representation of biological structure focuses on Alternaria, a genus of fungus comprising ellipsoid- and cylinder-shaped substructures. In the case of man-made furniture objects, we represent structure with spatially contiguous assemblages of blocks arbitrarily constructed according to a small set of design constraints.
We learn the models with Bayesian statistical inference over structure and camera parameters per image and, for man-made objects, across categories such as chairs. We develop a reversible-jump MCMC sampling algorithm to explore topology hypotheses, and a hybrid of Metropolis-Hastings and stochastic dynamics to search within topologies. Our results demonstrate that we can infer both 3-D object and camera parameters simultaneously from images, and that doing so improves understanding of structure in images. We further show how 3-D structure models can be inferred from single-view images, and that learned category parameters capture structure variation that is useful for recognition.
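For readers unfamiliar with the sampling terminology, the following is a minimal sketch of a plain random-walk Metropolis-Hastings sampler over a single continuous parameter. It is not the thesis algorithm: it omits the reversible-jump moves between topologies and the stochastic dynamics component, and the function name, the `step` and `seed` parameters, and the standard normal target are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_density, start, n_samples, step=1.0, seed=0):
    """Draw samples from an unnormalized 1-D density with random-walk proposals.

    All names here are illustrative; this is a generic textbook sampler,
    not the hybrid sampler described in the abstract.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centered on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def log_standard_normal(x):
    """Log of an unnormalized standard normal density (assumed toy target)."""
    return -0.5 * x * x


draws = metropolis_hastings(log_standard_normal, start=3.0, n_samples=20000)
mean = sum(draws) / len(draws)
print(f"sample mean: {mean:.3f}")
```

In a setting like the one the abstract describes, the state would presumably hold structure and camera parameters and the log density would combine a prior with an image likelihood, but those details are not given here.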