Robust partial least squares for regression and classification
by Turkmen, Asuman Seda, Ph.D., AUBURN UNIVERSITY, 2008, 133 pages; 3333162

Abstract:

Partial Least Squares (PLS) is a class of methods for modeling relations between sets of observed variables by means of latent variables where the explanatory variables are highly collinear and where they outnumber the observations. In general, PLS methods aim to derive orthogonal components using the cross-covariance matrix between the response variable(s) and the explanatory variables, a quantity that is known to be affected by unusual observations (outliers) in the data set. In this study, robustified versions of PLS methods, for regression and classification, are introduced.
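To make the sensitivity concrete, here is a minimal sketch, not taken from the dissertation, of the first PLS weight vector for a single response. For one response it is proportional to the cross-covariance vector between the centered explanatory variables and the centered response. The simulated data, the function name `first_pls_weight`, and the planted outlier are all illustrative assumptions.

```python
import numpy as np

def first_pls_weight(X, y):
    """Unit-norm first PLS weight vector, proportional to Xc^T yc."""
    Xc = X - X.mean(axis=0)   # center the explanatory variables
    yc = y - y.mean()         # center the response
    w = Xc.T @ yc             # (unnormalized) cross-covariance vector
    return w / np.linalg.norm(w)

# Simulated data (assumption): only the first variable drives the response.
rng = np.random.default_rng(0)
n, p = 30, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + 0.1 * rng.normal(size=n)

w_clean = first_pls_weight(X, y)

# Replace one observation with a gross outlier in the second variable.
X_bad, y_bad = X.copy(), y.copy()
X_bad[0] = np.array([0.0, 50.0, 0.0, 0.0, 0.0])
y_bad[0] = 50.0

w_bad = first_pls_weight(X_bad, y_bad)

# Absolute cosine between the two directions; a value near 0 means the
# single outlier has rotated the first PLS direction almost completely.
cos_clean_bad = abs(w_clean @ w_bad)
```

In this sketch a single contaminated row is enough to turn the first weight vector away from the informative variable and toward the contaminated one, which is the kind of behavior a robust version is meant to prevent.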

For regression with a quantitative response, a robust PLS regression method (RoPLS), based on weights calculated by the BACON or PCOUT algorithm, is proposed. A robust criterion is suggested to determine the optimal number of PLS components, which is an important issue in building a PLS regression model. In addition, diagnostic plots are constructed to visualize and classify outliers. The robustness of the proposed method, RoPLS, is studied in detail. An influence function for the RoPLS estimator is derived for low-dimensional data, and empirical robustness properties are provided for high-dimensional data.
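The general idea of a weight-based robust PLS regression can be sketched as follows. This is not the RoPLS algorithm itself: the abstract does not give its details, and the weighting step here is a deliberately crude, hypothetical stand-in (a MAD-scaled distance with a hard cutoff) rather than BACON or PCOUT. The function names, the cutoff, and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def outlyingness_weights(X, y, cutoff=3.0):
    """Hypothetical stand-in for BACON/PCOUT: 0/1 observation weights
    from a median/MAD-scaled Euclidean distance."""
    Z = np.column_stack([X, y])
    med = np.median(Z, axis=0)
    mad = 1.4826 * np.median(np.abs(Z - med), axis=0)
    mad[mad == 0] = 1.0                      # guard against zero scale
    d = np.linalg.norm((Z - med) / mad, axis=1)
    d_med = np.median(d)
    d_mad = 1.4826 * np.median(np.abs(d - d_med))
    return (d <= d_med + cutoff * d_mad).astype(float)

def weighted_pls1(X, y, obs_w, n_components):
    """Observation-weighted PLS1 (NIPALS-style). Returns (intercept, coef)."""
    s = obs_w.sum()
    x_mean = (obs_w @ X) / s                 # weighted column means
    y_mean = (obs_w @ y) / s
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ (obs_w * yk)              # weighted cross-covariance
        w /= np.linalg.norm(w)
        t = Xk @ w                           # scores
        tt = t @ (obs_w * t)
        p = Xk.T @ (obs_w * t) / tt          # X loadings
        qk = yk @ (obs_w * t) / tt           # y loading
        Xk = Xk - np.outer(t, p)             # deflate X
        yk = yk - qk * t                     # deflate y
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return y_mean - x_mean @ coef, coef

# Simulated data (assumption) with five planted outliers in rows 0..4.
rng = np.random.default_rng(0)
n, p = 60, 3
beta = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(n, p))
y = X @ beta + 0.1 * rng.normal(size=n)
X[:5] = 20.0
y[:5] = -100.0

obs_w = outlyingness_weights(X, y)
_, coef_robust = weighted_pls1(X, y, obs_w, n_components=3)
_, coef_plain = weighted_pls1(X, y, np.ones(n), n_components=3)
```

With all weights equal to one, the same routine reduces to ordinary PLS1, so `coef_plain` and `coef_robust` can be compared directly against `beta` to see the effect of downweighting.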

PLS was originally designed for regression problems with a quantitative response; however, it is also used as a classification technique where the response variable is qualitative. Although several robust PLS methods have been proposed for regression problems, to our knowledge, there has been no study on the robustness of PLS classification methods. In this study, the effect of outliers on existing PLS classification methods is investigated and a new robust PLS algorithm (RoCPLS) for classification is introduced.
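As a rough illustration of how PLS is used for classification, the sketch below codes a two-class label as 0/1, extracts a single PLS component, and assigns each observation to the class whose mean score is nearest. This is a generic, non-robust toy version under assumed simulated data; it is not RoCPLS, and the function names are hypothetical.

```python
import numpy as np

def pls_da_one_component(X, labels):
    """Fit a one-component PLS classifier for 0/1 labels."""
    x_mean = X.mean(axis=0)
    yc = labels - labels.mean()              # centered dummy response
    w = (X - x_mean).T @ yc                  # cross-covariance direction
    w /= np.linalg.norm(w)
    scores = (X - x_mean) @ w
    centers = np.array([scores[labels == 0].mean(),
                        scores[labels == 1].mean()])
    return x_mean, w, centers

def pls_da_predict(model, X_new):
    """Assign each row to the class with the nearest mean score."""
    x_mean, w, centers = model
    scores = (X_new - x_mean) @ w
    return np.argmin(np.abs(scores[:, None] - centers[None, :]), axis=1)

# Simulated data (assumption): more variables than observations,
# with the two classes differing in the first five variables only.
rng = np.random.default_rng(1)
n_per_class, p = 20, 50
shift = np.zeros(p)
shift[:5] = 3.0
X = np.vstack([rng.normal(size=(n_per_class, p)),
               rng.normal(size=(n_per_class, p)) + shift])
labels = np.repeat([0, 1], n_per_class)

model = pls_da_one_component(X, labels)
train_accuracy = (pls_da_predict(model, X) == labels).mean()
```

Because the direction `w` is again built from a cross-covariance, it inherits the same sensitivity to outlying observations as in the regression case.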

The performance of the proposed methods, RoPLS and RoCPLS, is assessed using several benchmark data sets and extensive simulation experiments.

 
Adviser: Nedret Billor
School: AUBURN UNIVERSITY
Source: DAI/B 69-10, Dec 2008
Source Type: Dissertation
Subjects: Statistics
Publication Number: 3333162
