Robust gene expression measure using databases of microarrays
by Sui, Yunxia, Ph.D., BROWN UNIVERSITY, 2010, 125 pages; 3430120

Abstract:

DNA microarrays have become an indispensable technique in biomedical research. The raw measurements from microarrays undergo a number of preprocessing steps before the data are converted to the genomic level for further analysis.

Background adjustment is an important preprocessing step. Estimating background noise is challenging because background levels vary substantially from probe to probe, yet only a limited number of observations are available for each probe. Most current methods use an empirical Bayes approach to borrow information across probes on the same array. These approaches shrink the background estimate for either the entire sample or for probes sharing similar sequence structures. In the first part of this thesis, we present a solution that is truly probe specific by using a database of a large number of microarray experiments. Information is borrowed across samples, and background noise is estimated for each probe individually. The ability to obtain probe-specific background distributions allows us to extend the dynamic range of gene expression levels. An R package, dbRMA, implementing this method is available.
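The idea of borrowing information across samples rather than across probes can be illustrated with a minimal sketch. It is not the dbRMA method, whose details are not given in this abstract. It assumes, purely for illustration, that the lowest fraction of a probe's log intensities across the database arrays comes from samples in which the target is absent, and it summarizes those values with a median and a scaled median absolute deviation. The function name `probe_background` and the parameter `low_fraction` are hypothetical.

```python
from statistics import median


def probe_background(intensities_by_probe, low_fraction=0.25):
    """Return {probe: (location, scale)} estimated separately for each probe.

    Illustrative assumption (not from the thesis): the lowest `low_fraction`
    of a probe's log intensities across the database arrays reflect
    background only.
    """
    estimates = {}
    for probe, values in intensities_by_probe.items():
        ordered = sorted(values)
        # Keep at least two observations so a spread can be computed.
        k = max(2, int(len(ordered) * low_fraction))
        low = ordered[:k]
        center = median(low)
        # Median absolute deviation, scaled to be comparable to a standard
        # deviation under a normal model.
        mad = median([abs(v - center) for v in low])
        estimates[probe] = (center, 1.4826 * mad)
    return estimates
```

Because each probe is summarized from its own history across arrays, two probes on the same array can receive very different background estimates, which is the property the abstract emphasizes.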

Because the gene expression measure is not directly observed, it is itself an estimate. The second part of the thesis presents an approach to estimating the standard error of each gene expression measure. The dependence of the variance on the gene expression level is incorporated into the standard error estimation via robust nonparametric techniques. This novel method shows striking improvement in the ability to detect differentially expressed genes compared with the most popular methods.
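One way to picture a robust nonparametric treatment of the variance-expression relationship is a running median of per-gene standard deviations taken over genes with similar expression levels. The sketch below is an assumption made for illustration, not the estimator developed in the thesis. The function name `smoothed_standard_errors` and the parameters `window` and `n_reps` are hypothetical.

```python
from statistics import median


def smoothed_standard_errors(mean_expr, raw_sd, n_reps, window=5):
    """Standard errors from a running median of SDs along expression level.

    Illustrative only: genes are ordered by mean expression, and each gene's
    SD is replaced by the median SD of its neighbours in that ordering.
    """
    order = sorted(range(len(mean_expr)), key=lambda i: mean_expr[i])
    half = window // 2
    se = [0.0] * len(mean_expr)
    for rank, gene in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        neighbours = [raw_sd[order[j]] for j in range(lo, hi)]
        # The median keeps a single aberrant gene from inflating the trend.
        se[gene] = median(neighbours) / n_reps ** 0.5
    return se
```

The median is used instead of a mean so that one gene with an unusually large sample variance does not distort the estimated trend for its neighbours.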

Most microarray studies provide only a relative measurement of gene expression. This measure cannot be used to compare different genes in the same sample because the probe effect varies from probe to probe. The small amount of calibration data limits the power of sequence-only models to predict the probe efficiency. The last part of this thesis demonstrates that, by taking advantage of a large database of experiments in combination with sequence models, the probe efficiency can be estimated with smaller variance. Gene expression measures adjusted for probe efficiency allow comparison between genes, as well as comparison of the same gene measured on different platforms.
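The abstract does not say how the database and the sequence model are combined. As a generic illustration of why combining two sources of information can reduce variance, the sketch below uses inverse-variance weighting of two estimates that are assumed to be independent and unbiased. The function name `combine_estimates` and all of its arguments are hypothetical and are not taken from the thesis.

```python
def combine_estimates(seq_pred, seq_var, db_est, db_var):
    """Inverse-variance weighted combination of two independent estimates.

    Illustrative only: `seq_pred` stands for a sequence-model prediction of
    probe efficiency and `db_est` for a database-derived estimate, with
    variances `seq_var` and `db_var`.
    """
    w_seq = 1.0 / seq_var
    w_db = 1.0 / db_var
    combined = (w_seq * seq_pred + w_db * db_est) / (w_seq + w_db)
    # Under independence, the combined variance is below either input variance.
    combined_var = 1.0 / (w_seq + w_db)
    return combined, combined_var
```

Under these assumptions the combined variance is always smaller than the smaller of the two input variances, which matches the qualitative claim that adding database information tightens the efficiency estimate.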

 
Adviser: Zhijin (Jean) Wu
School: BROWN UNIVERSITY
Source: DAI/B 71-11, p. , Nov 2010
Source Type: Dissertation
Subjects: Biostatistics; Bioinformatics
Publication Number: 3430120
Access the complete dissertation:
 

» Find an electronic copy at your library.
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3430120
  If your library subscribes to the ProQuest Dissertations & Theses (PQDT) database, you may be entitled to a free electronic version of this graduate work. If not, you will have the option to purchase one and to access a 24-page preview for free (if available).
