Shotgun optical mapping: A comprehensive statistical and computational analysis
by Valouev, Anton, Ph.D., UNIVERSITY OF SOUTHERN CALIFORNIA, 2006, 124 pages; 3238339

Abstract:

Shotgun Optical Mapping is a whole-genome, high-throughput restriction mapping technology in which restriction maps of single DNA molecules are collected using high-magnification digital microscopy. Optical Mapping has a wide spectrum of genomic applications and is thus an important subject for analysis. This thesis concerns the statistical and computational aspects of Optical Mapping data. Specifically, we address optical map alignment, whole-genome de novo assembly of optical maps, and the application of both to the analysis of genomic differences.

We start with statistical modelling of Optical Mapping measurements and validate that our models provide an accurate fit to real data. The measurement distributions are then used to derive a probabilistic alignment score, which we use to calculate optical-to-optical and optical-to-reference map alignments. The advantage of our approach is that it guarantees maximal discrimination between spurious and true alignments and does not require an ad hoc choice of scoring parameters.

We then present an efficient method for whole-genome assembly of optical maps that produces accurate restriction maps of the relevant genomes in feasible time. Our assembly method follows the Overlap-Layout-Consensus approach, which has been demonstrated to be very effective in sequence assembly problems. We also present a dedicated error correction method that we use to eliminate spurious overlaps and chimeric maps. Applying our assembler to several optical map datasets demonstrates that it is capable of handling mammalian-sized genomes and yields accurate restriction maps given sufficient genomic coverage.

We also demonstrate how Optical Mapping data can be used to identify a certain class of differences between genomes, specifically insertions and deletions exceeding 5000 base pairs and restriction fragment length polymorphisms. We develop a statistical framework for the analysis of these differences based on a hypothesis testing approach and demonstrate how the differences can be assessed statistically.

 
Adviser: Michael Waterman
School: UNIVERSITY OF SOUTHERN CALIFORNIA
Source: DAI/B 67-10, p. , Jan 2007
Source Type: Dissertation
Subjects: Genetics; Mathematics
Publication Number: 3238339

  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3238339