Framework and Algorithms for Extraction of Knowledge: Accelerated Radical Innovation and Spatial Interestingness Hotspots
by Miller, Ruth Huang, Ph.D., UNIVERSITY OF HOUSTON, 2011, 130 pages; 3492368

Abstract:

Data are the basic building blocks of computing. Extracting knowledge from the abundance of data requires substantial processing. Annotation, mining, and visualization are three transformational processes that convert these data into knowledge. Unstructured, semi-structured, and geo-spatial data have experienced unprecedented growth in volume and on-line availability with the explosion of the Internet. This growth makes it increasingly likely that the precise knowledge the user needs or wants is available somewhere, but it also makes retrieval, usage, and understanding of these data much more challenging. This dissertation examines three strategies for transforming data into knowledge. The first strategy is to collect and aggregate data from different sources into domain-specific data warehouse repositories that enable rapid knowledge retrieval and use. This strategy is used when the specific purpose has not been established in advance or the retrieval of this knowledge is time critical. The second strategy is to annotate the retrieved data with XML according to predetermined domain-specific ontologies to facilitate querying this knowledge. This strategy is best used for unstructured or semi-structured domain-specific documents. The third strategy centers on extracting knowledge from spatially annotated data. In this case, spatial context, particularly location, serves as the glue that ties together information originating from different knowledge sources.
The main contributions of this dissertation are: 1) development of a framework for finding geo-spatial hotspots, 2) development of a geo-feature pre-selection algorithm to automatically search for promising candidates, 3) development of ZIPS, an interestingness hotspot detection algorithm based on polygons, 4) experimental evaluation of the proposed algorithms in case studies involving Internet advertising, housing vacancies, and unemployment, 5) creation of a framework for agent-based, domain-specific data collection supporting the ARI Competitive Intelligence Methodology, and 6) creation of a framework for XML annotation of textual documents based on ontologies for subsequent querying.

 
Advisor:
School: UNIVERSITY OF HOUSTON
Source: DAI/B 73-04, p. , Jan 2012
Source Type: Dissertation
Subjects: Computer science
Publication Number: 3492368

  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3492368
