The development and application of a risk index to predict individualized chronic disease risk
by Kelly, Reagan John, Ph.D., UNIVERSITY OF MICHIGAN, 2009, 334 pages; 3392870

Abstract:

Integrating clinical and genetic information to improve clinicians' ability to estimate an individual's disease risk is an important biomedical research challenge. This dissertation develops a "risk index" procedure that combines clinical data and genome-wide genotypes to make predictions about individuals' risk of disease.

For a set of 100 simulated datasets, each containing 1,000 individuals, 8 clinical covariates, 500 single nucleotide polymorphisms (SNPs), and an outcome prevalence of 30%, the average area under the receiver operating characteristic (ROC) curve (AUC) for a risk index model built with clinical covariates and SNPs was significantly higher than that of a model built with clinical covariates alone (0.846 vs. 0.832, p=0.0002). A risk index model that included the principal components accounting for 90% of the variability in the SNPs also had a significantly higher average AUC than a model built with clinical covariates only (0.839 vs. 0.826, p=0.008). For a set of 25 simulated datasets, each containing 10,000 individuals, 29 clinical covariates, and 38,835 SNPs, a significant difference in average AUC was observed between clinical and clinical+genotype models (0.939 vs. 0.926, p=0.001) when the 500 SNPs most highly associated with the outcome were used. A risk index model including the 500 largest principal components of the 38,835 SNPs did not significantly increase the mean AUC beyond that of the clinical model (0.931 vs. 0.931, p=0.98).
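As an illustration of the two quantities reported above, the following minimal Python sketch computes an AUC from predicted risk scores and counts how many principal components are needed to reach a target share of variance. It is not the dissertation's risk index procedure. The function names `auc` and `components_for_variance`, the pairwise (Mann-Whitney) formulation of the AUC, and the toy inputs are all assumptions made for this example.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (illustrative).

    Equals the probability that a randomly chosen case (label 1) receives
    a higher score than a randomly chosen control (label 0); ties count 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def components_for_variance(eigenvalues, target=0.90):
    """Smallest number of leading components whose eigenvalues sum to at
    least `target` of the total variance (hypothetical helper)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return count
    return len(ordered)


# Toy inputs, invented for this sketch only.
toy_scores = [0.9, 0.2, 0.8, 0.1]
toy_labels = [1, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_k = components_for_variance([4.0, 4.0, 1.0, 1.0], target=0.75)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls, which is the scale on which the model comparisons above are reported.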

The risk index methodology was then applied to individuals from the Framingham Heart Study using 27 clinical covariates and 48,071 SNPs. Clinical+genotype risk index models built to predict ten-year incident hypertension, ten-year incident diabetes, or prevalent hypertension had AUCs of 0.475, 0.682, and 0.692, respectively, using the 500 SNPs most highly associated with the outcome, and AUCs of 0.563, 0.782, and 0.712, respectively, using the 500 largest principal components of the SNPs.

The results from these analyses suggest that the risk index methodology has utility for predicting an individual's risk of developing a chronic disease, and that the use of principal components of a large set of SNPs in place of a smaller selected set of associated SNPs provides the best predictive performance.

 
Adviser: Sharon R. Kardia
School: UNIVERSITY OF MICHIGAN
Source: DAI/B 71-02, p. , Apr 2010
Source Type: Dissertation
Subjects: Biostatistics; Public health; Bioinformatics
Publication Number: 3392870
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3392870