UMI  
ProQuest® Dissertations & Theses
 
 
Statistical complications of infectious disease data: Causal inference, multiple testing and machine learning
by Machekano, Rhoderick Neri, PhD, UNIVERSITY OF CALIFORNIA, BERKELEY, 2006, 0 pages; 3228412
 

Abstract: This thesis discusses methods for drawing causal inference in the presence of data pathologies such as missingness and multiple hypotheses, motivated by statistical problems encountered in everyday public health and medical practice. The thesis consists of three parts, addressing problems arising from malaria, human immunodeficiency virus (HIV) and trauma data.

The first part of the thesis proposes and compares statistical methods for missing outcomes in malaria efficacy studies. The methods include a class of inverse weighted estimators, the inverse probability of censoring weighted (IPCW) estimator and the doubly robust estimator; G-estimation; multiple imputation; and complete case and extreme case approaches. Simulation results demonstrate that the doubly robust estimator, an estimator that combines IPCW and G-computation estimation, is consistent when either the missingness mechanism or the covariate-outcome model is correctly specified. We apply the methods to estimate the effects of chloroquine plus sulfadoxine-pyrimethamine (CQSP) and amodiaquine plus sulfadoxine-pyrimethamine (AQSP) on malaria infection.

The second part of this thesis is motivated by data from the study of patterns of change in HIV gene sequences when infected patients are treated with antiretroviral drugs. The primary objective is to identify codon mutations in HIV genes that are associated with drug resistance. Because these data are high dimensional and dependent, testing associations between codon mutations and drug resistance presents a multiple testing (MT) problem. We propose two MT procedures: an augmented-Bonferroni procedure and an empirical Bayes procedure. The augmented-Bonferroni procedure is a modified Bonferroni procedure in which a guessed number of true null hypotheses is used as the denominator to get the adjusted p-value. The empirical Bayes procedure utilizes a mixture model to estimate the number of true null hypotheses and guess the set of true null hypotheses. We compare these new approaches to traditional MT procedures such as the Bonferroni and Holm procedures. The empirical Bayes procedure is less conservative and is suitable for exploratory analyses such as finding codon mutations associated with drug resistance.

The third part of this thesis concerns the problem of selecting subgroups of patients with long bone fractures who may benefit from early treatment compared to late treatment. Given that many baseline characteristics are available at admission, we would like to identify a subset of these characteristics that can guide the choice of treatment. We propose a definition of prognostic importance based on a definition of variable importance introduced by van der Laan [1], and estimate the prognostic importance of each covariate. We propose the quantile function multiple testing procedure [2] as a method for selecting the important covariates, controlling the familywise error rate. Application of these procedures to orthopedic surgery trauma data shows that the injury severity score is important in guiding the timing of long bone fracture fixation.
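
Illustrative note on the second part: the abstract does not give formulas, so the Python sketch below is only one plausible reading of the multiple testing comparison, not the thesis's implementation. It computes standard Bonferroni and Holm adjusted p-values, plus a variant that multiplies each p-value by a guessed number of true nulls instead of the total number of tests (equivalent to dividing the significance level by that guess). The function name `guessed_null_bonferroni`, the parameter `m0_guess`, and the example p-values are hypothetical.

```python
def bonferroni(pvals):
    """Standard Bonferroni: multiply each p-value by the number of tests m."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by m - k + 1.
        candidate = min(1.0, (m - rank) * pvals[i])
        # Enforce monotonicity of the adjusted p-values.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


def guessed_null_bonferroni(pvals, m0_guess):
    """Assumed reading of the augmented-Bonferroni idea: use a guessed
    number of true nulls (m0_guess) in place of the total number of tests.
    How the guess is obtained is not described in the abstract."""
    if not 1 <= m0_guess <= len(pvals):
        raise ValueError("m0_guess must lie between 1 and the number of tests")
    return [min(1.0, p * m0_guess) for p in pvals]


if __name__ == "__main__":
    example_pvals = [0.001, 0.01, 0.02, 0.04, 0.3]  # made-up values
    print("Bonferroni:  ", bonferroni(example_pvals))
    print("Holm:        ", holm(example_pvals))
    print("Guessed m0=3:", guessed_null_bonferroni(example_pvals, 3))
```

With a guess smaller than the total number of tests, every adjusted p-value is no larger than its Bonferroni counterpart, which matches the abstract's framing of the new procedures as less conservative alternatives.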

 
Advisor: Jewell, Nicholas P.; Hubbard, Alan E.
School: UNIVERSITY OF CALIFORNIA, BERKELEY
Source: DAI-B 67/08, p. 4190, Feb 2007
Source Type: PhD
Subjects: Biostatistics; Statistics; Epidemiology
Publication Number: 3228412
     
Access the complete dissertation:
 

» Find an electronic copy at your library.
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3228412
  If your library subscribes to the ProQuest Dissertations & Theses (PQDT) database, you may be entitled to a free electronic version of this graduate work. If not, you will have the option to purchase one and to access a 24-page preview for free (if available).

 
 
 




Copyright © 2007 ProQuest. All rights reserved.