PADMINI: A peer-to-peer distributed data mining system for astronomy researchers
by Mahule, Tushar Pradeep, M.S., UNIVERSITY OF MARYLAND, BALTIMORE COUNTY, 2010, 112 pages; 1481234

Abstract:

As the amount of data available at geographically distributed sources increases rapidly, efficient distributed data mining is becoming increasingly important. Increasing computational power at lower hardware cost, together with reliable communication mechanisms, has also led to the proliferation of Peer-to-Peer networks. These factors have led to the development of dedicated distributed solutions that can run on Peer-to-Peer networks. Many domains, such as finance, astronomy, and bioinformatics, face varied challenges where such solutions can prove instrumental. This thesis presents PADMINI, a Peer-to-Peer Astronomy Data Mining system. Unlike centralized data mining systems, PADMINI is a Web-based system powered by Google Sky and by distributed data mining algorithms that run on a collection of computing nodes. PADMINI supports two disparate frameworks, namely Hadoop and the Distributed Data Mining Toolkit. These frameworks enable PADMINI to support a wide range of data mining algorithms. This work presents solutions implemented on PADMINI for specific data mining problems such as Outlier Detection and Classifier Learning. The PADMINI system can also be used to learn classification models from any source of data over the internet, without requiring any kind of support from the host servers. Experimental results that establish the correctness of the solutions and the scalability of the PADMINI system are also provided.

 
Adviser: Hillol Kargupta
School: UNIVERSITY OF MARYLAND, BALTIMORE COUNTY
Source: MAI/ 49-01, p. , Oct 2010
Source Type: Thesis
Subjects: Computer science
Publication Number: 1481234

  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:1481234
  If your library subscribes to the ProQuest Dissertations & Theses (PQDT) database, you may be entitled to a free electronic version of this graduate work. If not, you will have the option to purchase one and to access a 24-page preview for free (if available).
