Integrative statistical methods for the analysis of transcriptomic and metabolomic data
by Poisson, Laila M., Ph.D., UNIVERSITY OF MICHIGAN, 2010, 148 pages; 3406407

Abstract:

Cancer research is embracing the multiple "-omics" technologies available for global-scale measurement of molecular events. Transcriptomics, the global measurement of gene expression, has been well developed through microarray technology. Metabolomics, an emerging omics field, uses chromatography-coupled mass spectrometry to measure the global activity of metabolites, or small molecules. The aim of this dissertation is to integrate the analysis of these two data sources to improve the ability to detect molecular changes between two disease states. This work is motivated by a prostate cancer progression study in which tumor samples of varying stage and benign tissue were assessed for both gene expression and metabolite levels. Using a pathway-directed approach, transcriptomic and metabolomic data can be mapped to one another through publicly available metabolic pathways. Here we describe three integrative methods.

In the first topic, we begin with a classification method that uses a list of differential elements from a prior study to make prognostic or diagnostic predictions about samples in a current study. We extend the classification method by providing a testing scenario for the classifier. Though motivated by the integration of in vitro and in vivo gene expression datasets, we show that it is applicable across omics platforms as well.

The second topic explores the use of p-value weighting to improve the power of per-metabolite tests of differential intensity. We use gene-set enrichment testing to capture the gene expression information contained in pre-defined pathways. The results of these tests are used to devise pathway-based weights. In this way, metabolites that are involved in a pathway that is dysregulated in its gene expression are given higher importance.
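The idea of p-value weighting can be illustrated with a weighted Benjamini-Hochberg step. This is a minimal sketch under stated assumptions, not the dissertation's procedure: the abstract does not say how the pathway-based weights are computed or which multiple-testing procedure they feed into. The function name `weighted_bh`, the example p-values, and the example weights are all hypothetical. Weights are rescaled to average 1, each p-value is divided by its weight, and the usual step-up rule is applied to the weighted values.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg sketch (hypothetical helper).

    pvalues: per-metabolite p-values.
    weights: non-negative importance weights, e.g. larger for metabolites
             in pathways whose gene expression looks dysregulated
             (how such weights are derived is an assumption left open here).
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(pvalues)
    if m != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Rescale so the weights average to 1, which keeps the error budget fixed.
    norm = [w * m / total for w in weights]

    # Weighted p-values; a zero weight means the test can never be rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, norm)]

    # Step-up rule: find the largest rank whose weighted p-value is under
    # the threshold alpha * rank / m, then reject everything up to that rank.
    order = sorted(range(m), key=lambda i: q[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= alpha * rank / m:
            k = rank

    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical example: four metabolites, the third sits in an up-weighted pathway.
pvals = [0.01, 0.04, 0.03, 0.20]
flat = weighted_bh(pvals, [1.0, 1.0, 1.0, 1.0])
boosted = weighted_bh(pvals, [1.0, 0.5, 2.0, 0.5])
```

In this toy input the third metabolite misses the unweighted cutoff but is rejected once its weight is doubled, which is the intended effect of giving higher importance to metabolites in dysregulated pathways. The cost is that down-weighted metabolites become harder to detect.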

Finally, the third topic extends two univariate set enrichment tests to jointly search for sets of genes and metabolites that are coordinately differential. We compare these methods to their univariate counterparts and to enrichment testing on the concatenated datasets. In almost all scenarios explored, testing the datasets jointly is preferred.
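To make the notion of joint testing concrete, the following sketch scores one pathway with a statistic that gives the gene block and the metabolite block equal say, and assesses it by permuting the shared sample labels. It is a generic stand-in, not either of the two enrichment tests extended in the dissertation, which the abstract does not name. The function names, the choice of an absolute Welch t statistic, and the equal 0.5 weighting are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def abs_t(x, y):
    """Absolute Welch t statistic between two groups of values."""
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    return abs(mean(x) - mean(y)) / se if se > 0 else 0.0


def block_score(rows, labels):
    """Mean |t| over the features (rows) of one data type."""
    scores = []
    for row in rows:
        a = [v for v, lab in zip(row, labels) if lab == 0]
        b = [v for v, lab in zip(row, labels) if lab == 1]
        scores.append(abs_t(a, b))
    return mean(scores)


def joint_set_stat(genes, metabs, labels):
    """Equal-weight average of the gene and metabolite block scores.

    Averaging per block keeps the larger platform from dominating,
    unlike pooling every feature into one concatenated set.
    """
    return 0.5 * (block_score(genes, labels) + block_score(metabs, labels))


def joint_perm_pvalue(genes, metabs, labels, n_perm=999, seed=0):
    """Permutation p-value; the same shuffled labels are used for both
    data types because the samples are matched."""
    rng = random.Random(seed)
    observed = joint_set_stat(genes, metabs, labels)
    perm = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if joint_set_stat(genes, metabs, perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical pathway: two genes and one metabolite measured on 8 matched samples.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = [
    [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],
    [2.0, 2.1, 1.9, 2.2, 4.0, 4.1, 3.9, 4.2],
]
metabs = [[0.5, 0.6, 0.4, 0.5, 1.5, 1.6, 1.4, 1.5]]
p_joint = joint_perm_pvalue(genes, metabs, labels)
```

With only four samples per group, the smallest attainable p-value is limited by the number of distinct label splits, so a sketch like this needs larger sample sizes to be useful in practice.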

Each of these methods is applied to the motivating metabolomics and matched gene expression datasets, and the results are discussed.

 
Adviser: Jeremy M.G. Taylor
School: UNIVERSITY OF MICHIGAN
Source: DAI/B 71-05, p. , May 2010
Source Type: Dissertation
Subjects: Molecular biology; Biostatistics
Publication Number: 3406407