Change detection with supervised learning
by Hu, Jing, Ph.D., ARIZONA STATE UNIVERSITY, 2007, 94 pages; 3270590

Abstract:

High-dimensional sensor data are routinely collected in modern processes to detect a change and signal when an anomaly occurs. Applications include manufacturing process control, computer network traffic monitoring and credit card default analysis. This research considers methods to detect a change in real time for such systems.

Multivariate statistical process control (SPC) techniques, developed under the assumption of a multivariate normal distribution, are widely applied. However, new challenges arise from large numbers of observations, high dimensionality, non-normal distributions, and mixtures of continuous and categorical variables. These challenges call for distribution-free, computationally intensive approaches.

This research transforms the problem into supervised learning tasks. It first considers detecting changes under specific faults in a multivariate process, where a specific fault is a shift in the means of one or more variables in particular directions. Artificial data are generated and contrasted with in-control data by a supervised learning algorithm, which provides a control boundary for real-time monitoring. The detection power of this approach closely matches that of Hotelling's T² statistic for a mean shift without a specified direction and is better for changes under specific faults.
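The artificial-contrast idea can be illustrated with a minimal sketch. This is not the dissertation's implementation: the abstract does not name the learner or say how the artificial data are generated. The sketch assumes a uniform artificial sample over the bounding box of the in-control data and a simple k-nearest-neighbor vote as the supervised learner. The function names make_artificial and artificial_score are hypothetical.

```python
import math
import random


def make_artificial(reference, n, rng):
    """Draw n points uniformly over the bounding box of the reference data.

    Assumption: a uniform contrast sample; the source does not specify one.
    """
    dims = range(len(reference[0]))
    lo = [min(x[j] for x in reference) for j in dims]
    hi = [max(x[j] for x in reference) for j in dims]
    return [[rng.uniform(lo[j], hi[j]) for j in dims] for _ in range(n)]


def artificial_score(x, reference, artificial, k=15):
    """Fraction of the k nearest training points that are artificial.

    In-control points are labeled 0 and artificial points 1, so a score
    near 1 means x lies where in-control data are rare.
    """
    labeled = [(math.dist(x, r), 0) for r in reference]
    labeled += [(math.dist(x, a), 1) for a in artificial]
    labeled.sort()
    return sum(label for _, label in labeled[:k]) / k


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated in-control data: two independent standard normal variables.
    reference = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
    artificial = make_artificial(reference, 200, rng)
    print(artificial_score([0.0, 0.0], reference, artificial))
    print(artificial_score([3.0, 3.0], reference, artificial))
```

A control boundary could then be a threshold on this score, for example one calibrated on held-out in-control observations to meet a target false-alarm rate. For direction-specific faults, the artificial sample could plausibly be concentrated along the fault direction instead of drawn uniformly; both choices are assumptions of this sketch.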

When a large collection of variables is monitored, not all of them contribute to a detected anomaly. It is important to investigate an out-of-control signal to determine which variables contribute most to it, in order to facilitate corrective action. A metric measuring contributing variables is proposed based on the concept of density, which is estimated through an artificial contrast using supervised learners. This metric is shown to be equivalent to the metric based on Hotelling's T² decomposition for normally distributed data.
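For reference, the T²-based baseline mentioned above can be sketched as follows. The abstract does not say which decomposition terms are used, so this sketch computes one common choice: the conditional contribution of each variable, defined as the full T² distance minus the T² distance computed without that variable. It assumes the in-control mean and covariance are known. The helper names solve, t2, and conditional_contributions are hypothetical, and the density-based metric proposed in the dissertation is not reproduced here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def t2(x, mean, cov, idx):
    """T^2 distance of x from the mean, using only the variables in idx."""
    if not idx:
        return 0.0
    d = [x[i] - mean[i] for i in idx]
    sub = [[cov[i][j] for j in idx] for i in idx]
    w = solve(sub, d)  # w = inverse(sub) * d
    return sum(di * wi for di, wi in zip(d, w))


def conditional_contributions(x, mean, cov):
    """For each variable j: full T^2 minus T^2 computed without variable j."""
    p = len(x)
    full = t2(x, mean, cov, list(range(p)))
    return [
        full - t2(x, mean, cov, [i for i in range(p) if i != j])
        for j in range(p)
    ]


if __name__ == "__main__":
    mean = [0.0, 0.0]
    cov = [[1.0, 0.8], [0.8, 1.0]]
    # A point that breaks the positive correlation between the two variables.
    print(conditional_contributions([1.0, -1.0], mean, cov))
```

With uncorrelated variables this reduces to the squared standardized deviation of each variable. With correlated variables it also credits a variable for breaking the in-control correlation structure, which a one-variable-at-a-time check would miss.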

Finally, to detect small-magnitude changes in a process, a detection procedure guided by a generalized likelihood ratio test is proposed. A sliding window is applied to the ongoing data stream and compared with in-control reference data using supervised learning approaches. The estimated likelihood ratios for the observations in a window are accumulated. The supervised learner estimates the probability ratio directly, avoiding the limitations of density estimation. Control limits are determined by either a normal approximation or a Markov chain model.
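A minimal sketch of the windowed statistic, under stated assumptions: the data are univariate, the supervised learner is a plain logistic regression fitted by gradient ascent, and the likelihood ratio is recovered from the classifier's log-odds after removing the class-size offset. None of these choices come from the abstract, which does not name a learner. The functions fit_logistic and window_statistic are hypothetical.

```python
import math
import random


def fit_logistic(xs, ys, steps=1000, lr=0.5):
    """Fit P(y=1 | x) = sigmoid(b0 + b1 * x) by batch gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def window_statistic(reference, window):
    """Sum of estimated log likelihood ratios over the window.

    The classifier separates window points (label 1) from reference
    points (label 0). Its log-odds equal the log density ratio plus
    log(n_window / n_reference), so that offset is subtracted.
    """
    xs = reference + window
    ys = [0] * len(reference) + [1] * len(window)
    b0, b1 = fit_logistic(xs, ys)
    offset = math.log(len(window) / len(reference))
    return sum(b0 + b1 * x - offset for x in window)


if __name__ == "__main__":
    rng = random.Random(1)
    reference = [rng.gauss(0, 1) for _ in range(300)]
    in_control = [rng.gauss(0, 1) for _ in range(50)]
    shifted = [rng.gauss(1, 1) for _ in range(50)]  # mean shift of one sigma
    print(window_statistic(reference, in_control))
    print(window_statistic(reference, shifted))
```

In monitoring, the statistic would be recomputed as the window slides along the stream and compared with a control limit. Because the classifier is fitted to the same window it scores, the statistic tends to be positive even when nothing has changed, which is one reason a calibrated control limit is needed. The calibration itself is not shown here.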

 
School: ARIZONA STATE UNIVERSITY
Source: DAI/B 68-06, p. , Oct 2007
Source Type: Dissertation
Subjects: Industrial engineering
Publication Number: 3270590