Automatic ontology extraction and applications
by Guo, Hui, Ph.D., STATE UNIVERSITY OF NEW YORK AT STONY BROOK, 2007, 118 pages; 3301487

Abstract:

At the moment, there is a great mismatch between the amount of data available in electronic (machine-readable) form and the amount of data that can actually be processed or used by computers. This is because most information is stored as text, which is easy for humans to process but too unstructured for use by computer algorithms.

In computer science, ontologies represent knowledge in different domains, and are designed for the purpose of knowledge sharing and reuse. Ontologies enable computers to reason about knowledge and do many tasks automatically. However, it is time-consuming to build ontologies by hand. Also, different tasks may require different ontologies.

In this thesis, I describe two approaches to automatic ontology extraction: one that operates on structured Web pages and one that operates on short text segments. I present example ontologies automatically extracted from real data, and describe and evaluate several applications of these ontologies: Web page content labeling, data extraction, multimedia generation and Web service matching. I show that automatic ontology extraction is feasible and can be used to quickly acquire information about new domains.

 
Advisor:
School: STATE UNIVERSITY OF NEW YORK AT STONY BROOK
Source: DAI/B 69-02, p. , May 2008
Source Type: Dissertation
Subjects: Computer science
Publication Number: 3301487