Statistical monitoring and cluster detection under naturally occurring heterogeneous dichotomous events
by Taseli, Aysun, Ph.D., NORTHEASTERN UNIVERSITY, 2011, 155 pages; 3443837

Abstract:

Many processes produce a count statistic that is a sum of multiple non-homogeneous dichotomous random variables, that is, with different values of the Bernoulli parameter p. The probability distribution of this count statistic is the convolution of J non-identical binomial distributions and can differ significantly from its binomial and normal counterparts. In such cases the homogeneity assumption can result in incorrect probability calculations and incorrect conclusions from statistical procedures such as control charts, sequential probability ratio tests, and cluster detection via scan statistics. Use of the exact (J-binomial) distribution, however, can require prohibitively expensive calculations as the number (J) of non-identical binomial random variables in the convolution increases.
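As an illustration of the convolution described above, the following is a minimal Python sketch that computes the exact J-binomial probabilities by repeatedly convolving binomial distributions. The function names and the (n, p) group representation are assumptions made for this example and are not taken from the dissertation.

```python
from math import comb


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def convolve(a, b):
    """Discrete convolution of two probability vectors."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def j_binomial_pmf(groups):
    """Exact distribution of a sum of independent binomials.

    groups: list of (n_j, p_j) pairs, one per homogeneous subgroup
    (a hypothetical input format chosen for this sketch).
    """
    pmf = [1.0]  # distribution of an empty sum: all mass at zero
    for n, p in groups:
        pmf = convolve(pmf, binomial_pmf(n, p))
    return pmf


# Example: 10 low-risk and 20 high-risk trials.
pmf = j_binomial_pmf([(10, 0.1), (20, 0.4)])
upper_tail = sum(pmf[15:])  # P(count >= 15)
```

Each convolution step costs time proportional to the product of the two vector lengths, which gives a sense of why the exact calculation becomes burdensome as J and the group sizes grow.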

Motivated by these issues, this dissertation has three foci. The first is testing and monitoring heterogeneous processes over time. Risk-adjusted sequential probability ratio tests (SPRTs) and resetting SPRT charts are derived; their accuracy and detection performance (average run lengths and operating characteristic curves) are compared to those of methods that assume homogeneity and are shown to be significantly better in some applications.
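To make the idea of a risk-adjusted SPRT concrete, here is a minimal Python sketch. It uses Wald's standard decision boundaries and assumes that the alternative hypothesis multiplies each observation's odds of an event by a fixed odds ratio; that choice of alternative is a common formulation and an assumption of this example, not necessarily the derivation used in the dissertation. The function name and parameters are hypothetical.

```python
from math import log


def sprt_risk_adjusted(outcomes, risks, odds_ratio=2.0, alpha=0.05, beta=0.05):
    """Wald SPRT on Bernoulli outcomes with observation-specific risks.

    outcomes: sequence of 0/1 events
    risks: null-hypothesis event probability for each observation (0 < p < 1)
    odds_ratio: assumed shift in the odds under the alternative
    Returns (decision, number of observations used, log-likelihood ratio).
    """
    lower = log(beta / (1 - alpha))   # accept H0 at or below this value
    upper = log((1 - beta) / alpha)   # reject H0 at or above this value
    llr = 0.0
    t = 0
    for t, (y, p0) in enumerate(zip(outcomes, risks), start=1):
        # Event probability under H1 for this observation's own risk.
        p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
        llr += log(p1 / p0) if y else log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", t, llr
        if llr <= lower:
            return "accept H0", t, llr
    return "continue", t, llr


# Example: five consecutive events among observations with 10% baseline risk.
decision, n_used, llr = sprt_risk_adjusted([1] * 10, [0.1] * 10)
```

Because each increment depends on that observation's own risk, an event from a high-risk case moves the statistic less than the same event from a low-risk case, which is the point of risk adjustment.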

The second focus area is detection of geographical clusters via scan statistics in the presence of natural heterogeneity. Two risk-adjusted models of Kulldorff's Bernoulli scan statistic, based on the product of risk-adjusted probabilities (J-Bernoulli model) and the distribution of heterogeneity (J-binomial model), are developed, and their performance relative to the conventional method is explored.
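For reference, the conventional (homogeneous) Bernoulli scan statistic that serves as the baseline here can be sketched as follows. This Python example computes the log-likelihood ratio for a single candidate zone and is only the standard unadjusted form; the risk-adjusted J-Bernoulli and J-binomial models are not reproduced, since the abstract does not give their details. Function and argument names are assumptions for illustration.

```python
from math import log


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)


def bernoulli_loglik(cases, n):
    """Maximized Bernoulli log-likelihood for `cases` events in `n` trials."""
    if n == 0:
        return 0.0
    p = cases / n
    return _xlogy(cases, p) + _xlogy(n - cases, 1 - p)


def bernoulli_scan_llr(c, n, total_cases, total_n):
    """Log-likelihood ratio for one candidate zone.

    c, n: cases and population inside the zone
    total_cases, total_n: cases and population in the whole study region
    Returns 0 unless the rate inside the zone exceeds the rate outside.
    """
    if n == 0 or n == total_n:
        return 0.0
    out_c, out_n = total_cases - c, total_n - n
    if c / n <= out_c / out_n:
        return 0.0
    return (bernoulli_loglik(c, n)
            + bernoulli_loglik(out_c, out_n)
            - bernoulli_loglik(total_cases, total_n))


# The scan statistic is the maximum over candidate zones (hypothetical counts).
zones = [(3, 10), (8, 10), (12, 30)]  # (cases, population) per zone
scan_stat = max(bernoulli_scan_llr(c, n, 20, 100) for c, n in zones)
```

In practice the significance of the maximum is assessed by Monte Carlo replication under the null hypothesis, which is omitted from this sketch.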

Monte Carlo performance analyses show that the risk-adjusted models lead to better inferences, detection times, and probabilities over a variety of scenarios, and provide insights for the selection and use of correct methodologies when heterogeneous dichotomous events occur.

The third focus addresses computational issues for J-binomial distributions. Computing these probabilities is important in many applications, especially since each of the above-mentioned methods requires tens to thousands of J-binomial probability calculations. The accuracy of J-binomial probability estimates obtained from cumulant-based expansions that use orthogonal polynomials, and from saddle point approximations, is explored by comparison with both exact probabilities and Monte Carlo estimates (MCE). A normalized Gram-Charlier expansion (NGCE) and saddle point approximations are shown to produce the most accurate results and to be more time-efficient than computing the exact probabilities or the MCE. The NGCE algorithm is practical, known to produce an estimate under all scenarios, and of great value to analysts since it can easily be integrated into computer code.
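As a rough illustration of the saddle point idea, the Python sketch below applies the basic (unnormalized) saddlepoint approximation to a single J-binomial probability, using the cumulant generating function K(s) = sum of n_j * log(1 - p_j + p_j * e^s). This is the textbook first-order form, assumed here for illustration; the specific variants and the NGCE evaluated in the dissertation are not reproduced. All names are hypothetical.

```python
from math import exp, log, pi, sqrt


def saddlepoint_pmf(k, groups):
    """Approximate P(X = k) for X a sum of independent Binomial(n_j, p_j).

    groups: list of (n_j, p_j) pairs with 0 < p_j < 1.
    Valid only for 0 < k < sum of n_j, where the saddlepoint exists.
    """
    total_n = sum(n for n, _ in groups)
    if not 0 < k < total_n:
        raise ValueError("k must lie strictly between 0 and the total n")

    def tilted(p, s):
        # Success probability after exponential tilting by s.
        return p * exp(s) / (1 - p + p * exp(s))

    def k0(s):  # cumulant generating function K(s)
        return sum(n * log(1 - p + p * exp(s)) for n, p in groups)

    def k1(s):  # K'(s): mean of the tilted distribution
        return sum(n * tilted(p, s) for n, p in groups)

    def k2(s):  # K''(s): variance of the tilted distribution
        return sum(n * tilted(p, s) * (1 - tilted(p, s)) for n, p in groups)

    # K'(s) is increasing in s, so solve K'(s) = k by bisection.
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if k1(mid) < k:
            lo = mid
        else:
            hi = mid
    s_hat = (lo + hi) / 2
    return exp(k0(s_hat) - s_hat * k) / sqrt(2 * pi * k2(s_hat))


# Example: probability of exactly 11 events among two heterogeneous groups.
approx = saddlepoint_pmf(11, [(20, 0.1), (30, 0.3)])
```

The cost of one evaluation grows only linearly with the number of groups J, in contrast to the repeated convolutions needed for the exact distribution.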

 
Adviser: James Benneyan
School: NORTHEASTERN UNIVERSITY
Source: DAI/B 72-04, p. , Mar 2011
Source Type: Dissertation
Subjects: Statistics; Industrial engineering
Publication Number: 3443837