Human-centered semantic retrieval in multimedia databases
by Chen, Xin, Ph.D., THE UNIVERSITY OF ALABAMA AT BIRMINGHAM, 2008, 195 pages; 3316460

Abstract:

This dissertation proposes a human-centered retrieval framework that automatically retrieves multimedia data based on semantic content. In particular, the framework queries and searches images and videos in a multimedia database according to their visual content. For computers to understand the semantic content of images and videos, human guidance is necessary. By incorporating the user's Relevance Feedback (RF) on the retrieval results into the learning and retrieval mechanism, the semantic gap between humans and computers can be gradually bridged. A second problem is that the high-dimensional feature vectors of multimedia data can dramatically increase computation time, a problem known as the "Curse of Dimensionality." To alleviate it, clustering algorithms are designed to reduce the search space for retrieval and thus the computation time. In addition, to facilitate the query and retrieval of video data, a multimedia database model is designed according to the spatiotemporal nature of video data.
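
The idea of using clustering to shrink the search space can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithms used in the dissertation, so this example substitutes plain k-means and a nearest-centroid probe; the function names (`kmeans`, `clustered_search`), the `probe` parameter, and the toy feature vectors are all assumptions made for illustration.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_centroid(v, centroids):
    """Index of the centroid closest to vector v."""
    return min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means (a stand-in, not the dissertation's algorithm)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        assign = [nearest_centroid(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment consistent with the final centroids.
    assign = [nearest_centroid(v, centroids) for v in vectors]
    return centroids, assign

def clustered_search(query, vectors, centroids, assign, top_n=3, probe=1):
    """Rank only the items in the `probe` clusters nearest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    chosen = set(order[:probe])
    candidates = [i for i, a in enumerate(assign) if a in chosen]
    ranked = sorted(candidates, key=lambda i: dist(query, vectors[i]))
    return ranked[:top_n], len(candidates)

# Toy 2-D "feature vectors" forming two well-separated groups (made-up data).
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, assign = kmeans(vectors, k=2)
hits, scanned = clustered_search([10.2, 10.1], vectors, centroids, assign)
print(hits, "scanned", scanned, "of", len(vectors))
```

In this sketch only the cluster nearest the query is scanned, so half of the toy database is never compared against the query. The trade-off is that a relevant item sitting in an unprobed cluster would be missed, which is why `probe` is left adjustable.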

The proposed framework is composed of three major components: Interactive Content-based Image Retrieval (CBIR), Semantic Video Retrieval, and a Spatiotemporal Multimedia Database Model.

The Interactive CBIR component maps the region-based image retrieval problem to a Multiple Instance Learning (MIL) problem. A distance-based clustering algorithm and a semantic-based clustering algorithm are designed to reduce the search space. This component supports both short-term and long-term learning. The Semantic Video Retrieval component emphasizes the study of spatiotemporal characteristics and relations among semantic objects in videos. Traffic incidents in transportation surveillance videos and abnormal human interactions in indoor surveillance videos are used as case studies. The proposed work designs and implements a semantic event retrieval system for intelligent surveillance systems, in which RF plays a key role in the retrieval process. Various spatiotemporal event models and learning mechanisms are designed and tested. In addition, because retrieval from surveillance video databases is a focus of this research, an efficient conceptual Spatiotemporal Multimedia Database model is designed to facilitate queries for spatiotemporal events of interest to the user. A case study on the proposed database model is provided using transportation surveillance videos.
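
The mapping from region-based image retrieval to MIL described above can be sketched as follows: each image is a "bag" whose "instances" are the feature vectors of its regions, and the user's relevance feedback labels whole bags rather than individual regions. The abstract does not name the learner used in the dissertation, so this example substitutes a simple heuristic that picks one region as the target concept; the names `learn_concept` and `rank_images`, the scoring rule, and the toy data are assumptions made for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two region feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def bag_distance(bag, point):
    """Distance from a point to the closest instance (region) in a bag (image)."""
    return min(dist(inst, point) for inst in bag)

def learn_concept(positive_bags, negative_bags):
    """Pick the region from a relevant image that lies close to every
    relevant image and far from the irrelevant ones (a stand-in heuristic)."""
    best, best_score = None, float("-inf")
    for bag in positive_bags:
        for cand in bag:
            pos = sum(bag_distance(b, cand) for b in positive_bags)
            neg = min((bag_distance(b, cand) for b in negative_bags), default=0.0)
            score = neg - pos  # far from negatives, near positives
            if score > best_score:
                best, best_score = cand, score
    return best

def rank_images(bags, concept):
    """Order image names by how closely their best region matches the concept."""
    return sorted(bags, key=lambda name: bag_distance(bags[name], concept))

# Toy database: each image is a list of 2-D region features (made-up data).
bags = {
    "img_a": [[0.9, 0.1], [0.2, 0.8]],
    "img_b": [[0.85, 0.15], [0.5, 0.5]],
    "img_c": [[0.2, 0.75], [0.1, 0.9]],
    "img_d": [[0.88, 0.12], [0.3, 0.3]],
}
# One round of relevance feedback: the user marks whole images only.
relevant = [bags["img_a"], bags["img_b"]]
irrelevant = [bags["img_c"]]

concept = learn_concept(relevant, irrelevant)
ranking = rank_images(bags, concept)
print(concept, ranking)
```

The point of the bag-level labels is that the user never has to say which region made an image relevant; in the toy data the unlabeled image `img_d` rises in the ranking because one of its regions resembles the region shared by the two relevant images.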

In brief, the human-centered multimedia retrieval system proposed in this research focuses on alleviating the two problems mentioned above: the "Semantic Gap" and the "Curse of Dimensionality." The Interactive Region-based Image Retrieval component and the Semantic Video Retrieval component both use RF in the learning and retrieval phases to address the "Semantic Gap." These two components also integrate RF with MIL to ease the burden on users of providing feedback on the retrieval results. To alleviate the "Curse of Dimensionality," semantic clustering algorithms are designed and implemented that consider both the low-level features of multimedia data and high-level human perception. To facilitate query and retrieval, a spatiotemporal multimedia database model is proposed to provide an efficient indexing scheme.

The experimental results for each individual component are presented. Comparisons with related work are conducted, showing the effectiveness of the proposed framework.

 
Adviser: Chengcui Zhang
School: THE UNIVERSITY OF ALABAMA AT BIRMINGHAM
Source: DAI/B 69-08, p. , Oct 2008
Source Type: Dissertation
Subjects: Computer science
Publication Number: 3316460
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3316460