Learning from local image regions
by Dollar, Piotr, Ph.D., UNIVERSITY OF CALIFORNIA, SAN DIEGO, 2007, 89 pages; 3274503

Abstract:

A trend in computer vision over the last decade or so has been to describe the statistics and content of images in terms of local image regions, i.e., image patches. Applications have included object detection, scene recognition, texture classification and image categorization. Local patch-based representations have the advantage that they are robust to global transformations, occlusion, clutter, and object and image variation, while retaining rich information about image content. This is the case even when global information relating the relative position of patches is not used, as in so-called "bags of words" approaches. Furthermore, in the supervised learning framework, where labeled images are a source of data, characterizing images using patches means a single image can provide a large number of patches for training. These properties suggest that local patch-based representations should continue to find expanded use in computer vision.
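To make the point about training data concrete, the following is a minimal sketch of dense patch extraction, not code from the dissertation. The function name `extract_patches`, the 8-pixel patch size and the 4-pixel stride are illustrative assumptions; the sketch only shows how one modest image yields many training patches.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_patches(image, patch_size=8, stride=4):
    """Return square patches of a 2D grayscale image, sampled on a regular grid.

    patch_size and stride are illustrative defaults, not values from the source.
    """
    # View of every patch_size x patch_size window: shape (H-p+1, W-p+1, p, p).
    windows = sliding_window_view(image, (patch_size, patch_size))
    # Keep one window every `stride` pixels in each direction.
    windows = windows[::stride, ::stride]
    # Flatten the grid of windows into a list of patches.
    return windows.reshape(-1, patch_size, patch_size)


# A synthetic 64 x 64 image stands in for real data.
image = np.arange(64 * 64, dtype=np.float64).reshape(64, 64)
patches = extract_patches(image)
print(patches.shape)  # (225, 8, 8)
```

Even this small synthetic image produces 225 overlapping patches, each of which could serve as a separate training example.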

In this dissertation we show the application of patch-based methods to three domains for which more global approaches have traditionally been used. First, we show how the classic problem of edge detection can be posed as a series of patch-by-patch decisions that can be solved in a supervised learning framework. We show the application of this approach to a number of specific domains such as mouse boundary detection and road detection. Second, we show how modeling object warps and highly non-linear image transformations can again be done locally, thus avoiding the computational challenges and the scarcity of data typically associated with these problems. For example, our approach is able to learn eye motion and out-of-plane rotation of a teacup from sparse data. Third, we extend the notion of local regions from 2D to 3D, i.e., from patches to cuboids, in order to model the content of video. We show applications to behavior recognition in a number of domains including human activity and mouse behavior.
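The first of these ideas, edge detection as a series of patch-by-patch decisions, can be illustrated with a small sketch of how a supervised training set might be assembled. This is an assumption-laden illustration rather than the dissertation's method: the function name `patch_training_set`, the patch radius and the toy step-edge image are all hypothetical, and the abstract does not specify which features or classifier are used.

```python
import numpy as np


def patch_training_set(image, edge_map, radius=2):
    """Pair each interior pixel's patch with the edge label at its center.

    image:    2D grayscale array.
    edge_map: 2D boolean array of the same shape (hypothetical ground truth).
    radius:   half-width of the square patch (illustrative default).
    """
    height, width = image.shape
    features, labels = [], []
    for r in range(radius, height - radius):
        for c in range(radius, width - radius):
            patch = image[r - radius:r + radius + 1, c - radius:c + radius + 1]
            features.append(patch.ravel())      # one feature vector per patch
            labels.append(int(edge_map[r, c]))  # 1 if the center pixel is an edge
    return np.array(features), np.array(labels)


# Toy example: a vertical step edge at column 4 of an 8 x 8 image.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
edge_map = np.zeros((8, 8), dtype=bool)
edge_map[:, 4] = True

X, y = patch_training_set(image, edge_map)
print(X.shape, int(y.sum()))  # (16, 25) 4
```

Any binary classifier could then be trained on `X` and `y` and applied to every patch of a new image to produce an edge map.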

The methods we introduce here advance the state of the art and have the potential to be useful in a broad range of applications in computer vision. Our approach to edge detection currently outperforms all competing approaches for grayscale edge detection and comes in a close second for color edge detection on the well-established Berkeley Segmentation Dataset. We hope it will play a role similar to that of Canny edge detection, but for highly textured, real-world images. Our approach to modeling object warps locally showed dramatic improvements over previous such methods, and helped solidify the theoretical foundation of nonlinear manifold learning. Finally, our cuboids formalism is simple yet powerful, and has already been used in two vision systems. It has the potential to serve as the basis for a broad range of methods for describing the contents of video. Overall, our contribution has been to help establish the importance of patch-based approaches and to expand our understanding of a fundamental aspect of computer vision.

 
Adviser: Serge Belongie
School: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Source: DAI/B 68-06, p. , Oct 2007
Source Type: Dissertation
Subjects: Computer science
Publication Number: 3274503
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3274503
