Decoding mammalian gene regulatory programs through efficient microarray, ChIP-chip and sequence analysis
by Ji, Hongkai, Ph.D., HARVARD UNIVERSITY, 2007, 172 pages; 3265178

Abstract:

Knowing how gene regulatory programs are encoded in the genome and executed in living cells is key to understanding human diseases. The goal of this thesis is to explore efficient statistical strategies for dissecting mammalian gene regulation.

An empirical hierarchical Bayes approach was proposed to analyze gene expression data collected from microarray experiments. Through a closed-form variance shrinkage estimator, information from multiple genes is pooled to increase the statistical power of multiple hypothesis testing. The approach allows various types of subject matter knowledge to be incorporated conveniently. Caveats in controlling the false discovery rate (FDR) are discussed. When a variance shrinkage estimator is employed, inappropriate use of permutations may result in underestimation of the FDR.
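The idea of pooling variance information across genes can be illustrated with a minimal sketch. This is not the closed-form estimator of the thesis: the function names `shrink_variances` and `moderated_t` are hypothetical, and the shrinkage weight is passed in as a fixed number by assumption, whereas an empirical Bayes method would estimate it from the data.

```python
from statistics import mean, variance


def shrink_variances(sample_vars, shrinkage):
    """Pull each per-gene variance toward the across-gene mean variance.

    shrinkage = 0 keeps the raw variances; shrinkage = 1 replaces every
    variance with the pooled mean. The weight is an assumed input here.
    """
    if not 0.0 <= shrinkage <= 1.0:
        raise ValueError("shrinkage must lie in [0, 1]")
    target = mean(sample_vars)
    return [(1.0 - shrinkage) * v + shrinkage * target for v in sample_vars]


def moderated_t(group_a, group_b, shrinkage=0.5):
    """Two-sample t-like statistic per gene, using shrunken variances.

    group_a and group_b are lists with one list of replicate values per
    gene; each gene needs at least two replicates in each group.
    """
    diffs, raw_vars, scales = [], [], []
    for a, b in zip(group_a, group_b):
        df = len(a) + len(b) - 2
        pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
        diffs.append(mean(a) - mean(b))
        raw_vars.append(pooled)
        scales.append(1.0 / len(a) + 1.0 / len(b))
    shrunk = shrink_variances(raw_vars, shrinkage)
    return [d / (v * s) ** 0.5 for d, v, s in zip(diffs, shrunk, scales)]
```

With only a few replicates per gene, a raw sample variance can be very small, or even zero, by chance, which makes the ordinary t-statistic unstable or undefined. Borrowing strength from the other genes keeps the denominator away from zero, which is the intuition behind the power gain described above.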

Based on the hierarchical empirical Bayes approach, a TileMap method was developed for tiling array data analysis. The method combines the hierarchical model with a Moving Average (MA) method or a Hidden Markov Model (HMM). Unbalanced Mixture Subtraction (UMS) was proposed to provide approximate estimates of the false discovery rate for MA and of the model parameters for HMM. Applying TileMap to ChIP-chip data allows one to detect transcription factor binding regions at a resolution of 500–2000 base pairs. Systematic evaluations showed that the method significantly improved the detection of protein-DNA binding regions compared with previously existing methods.
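The moving average step can be sketched as follows. This is a simplified illustration, not the TileMap implementation: it assumes probes are evenly spaced and already ordered along the genome, the names `moving_average` and `call_regions` are hypothetical, and the threshold is supplied by hand rather than derived from a UMS-based FDR estimate.

```python
def moving_average(stats, half_window):
    """Average each probe-level statistic with its neighbours.

    The window covers up to half_window probes on each side and is
    truncated at the ends of the array.
    """
    n = len(stats)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        smoothed.append(sum(stats[lo:hi]) / (hi - lo))
    return smoothed


def call_regions(smoothed, threshold):
    """Return (start, end) probe index pairs, end exclusive, for each
    run of consecutive probes whose smoothed value is >= threshold."""
    regions = []
    start = None
    for i, value in enumerate(smoothed):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(smoothed)))
    return regions
```

For example, `call_regions(moving_average([0, 0, 3, 3, 3, 0, 0], 1), 2.0)` reports the single run of probes at indices 2 to 4. Averaging over neighbouring probes exploits the fact that a true binding event raises the signal of several adjacent probes, whereas isolated noisy probes are damped.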

Finally, a comparative analysis involving six human and mouse transcription factors was performed to explore common characteristics of mammalian chromatin immunoprecipitation data and potential issues in their analysis. The cross-study comparisons revealed the importance of matched genomic controls in the de novo identification of 6–30 base pair long transcription factor binding motifs, raised issues about the interpretation of ubiquitously occurring sequence motifs, and demonstrated the clustering tendency of protein binding regions for certain transcription factors.
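Why matched controls matter for motif interpretation can be shown with a small sketch. It is an illustrative assumption, not the analysis used in the thesis: the names `count_hits` and `enrichment_ratio` are hypothetical, a motif is reduced to an exact string match on the forward strand, and the pseudocount is an arbitrary choice.

```python
def count_hits(seqs, motif):
    """Number of sequences containing the motif at least once
    (exact match, forward strand only, case-insensitive)."""
    motif = motif.upper()
    return sum(1 for s in seqs if motif in s.upper())


def enrichment_ratio(bound, control, motif, pseudocount=1.0):
    """Ratio of motif occurrence rates in bound versus control regions.

    The pseudocount avoids division by zero when the motif is absent
    from the control set. A ratio near 1 means the motif is no more
    common in bound regions than in the matched controls.
    """
    fg = (count_hits(bound, motif) + pseudocount) / (len(bound) + pseudocount)
    bg = (count_hits(control, motif) + pseudocount) / (len(control) + pseudocount)
    return fg / bg
```

A motif that appears in nearly every bound region looks striking in isolation, but if it appears just as often in control regions with similar sequence composition the ratio stays near 1, which is the situation the abstract describes as a ubiquitously occurring motif.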

The methods developed here were applied to dissect gene regulatory programs in the mouse Sonic Hedgehog (SHH) signaling pathway. Through a combined analysis of gene expression, ChIP-chip and sequence motifs, new enhancers that are targeted by the transcription factor GLI1 were discovered.

 
Adviser: Wing Hung Wong
School: HARVARD UNIVERSITY
Source: DAI/B 68-05, p. , Aug 2007
Source Type: Dissertation
Subjects: Statistics; Bioinformatics
Publication Number: 3265178

  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3265178