Proactive experiment-driven learning for system management
by Shivam, Piyush, Ph.D., DUKE UNIVERSITY, 2007, 190 pages; 3284097

Abstract:

The overall behavior of a system depends on a large number of factors related to the underlying hardware, system software, and running applications. In addition, system behavior may be influenced by interactions among these factors, where the impact of an individual factor on a system depends on the settings of other factors. A 'system knowledge base' that captures how different factors and multifactor interactions affect the end-to-end behavior of a system is a prerequisite for managing systems effectively. This dissertation addresses the hypothesis that we can learn such a knowledge base in an automatic, proactive, and timely manner by planning and conducting experiments.
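
To make the notion of a multifactor interaction concrete, the following minimal Python sketch estimates the two main effects and the interaction effect from a full two-level, two-factor design, a standard calculation from design of experiments. The function name, the two factors (memory size and thread-pool size), and the throughput numbers are hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
from itertools import product


def factorial_effects(response):
    """Estimate effects from a full 2x2 factorial design.

    `response` maps (a, b) to a measured value, where each factor level
    is coded as -1 (low) or +1 (high). Returns (effect_a, effect_b,
    effect_ab), each being the average change in the response when the
    corresponding contrast goes from -1 to +1.
    """
    runs = list(product((-1, 1), repeat=2))
    half = len(runs) / 2
    effect_a = sum(a * response[(a, b)] for a, b in runs) / half
    effect_b = sum(b * response[(a, b)] for a, b in runs) / half
    effect_ab = sum(a * b * response[(a, b)] for a, b in runs) / half
    return effect_a, effect_b, effect_ab


# Hypothetical throughput (requests/s); a = memory size, b = thread-pool size.
throughput = {
    (-1, -1): 100.0,
    (1, -1): 120.0,
    (-1, 1): 110.0,
    (1, 1): 170.0,
}

print(factorial_effects(throughput))  # (40.0, 30.0, 20.0)
```

In this made-up data, raising memory adds 20 requests/s when the thread pool is small but 60 requests/s when it is large. The nonzero interaction effect (20.0) captures exactly this dependence of one factor's impact on the setting of another.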

An experiment is a run of the system for a specific setting of the system's workload, resource allocation, and configuration. In this dissertation, we develop a general experiment-driven framework that incorporates: (a) policies for automatic planning of experiments to explore a large space of factors and interactions efficiently; and (b) mechanisms to conduct experiments for three important system domains: Web services, batch computing, and storage servers. The policies and mechanisms leverage techniques from design of experiments, active machine learning, and system virtualization to build a sufficiently accurate system knowledge base quickly.

The dissertation makes the following contributions:

(1) Quantifies the linear and nonlinear impact of a factor or an interaction on system behavior, and develops experiment-planning algorithms to estimate the impact of important factors and interactions in a system. We use this work to rank the factors and interactions that can affect the performance (e.g., throughput) of multitier Web services.

(2) Develops experiment-planning algorithms to build models that predict the system behavior as a function of factors and interactions that affect this behavior. We explore a continuum of modeling alternatives ranging from a priori models to black-box models. We learn models to enable task and data placement of batch computing applications, and to predict performance measures of Web services like response time and throughput.

(3) Develops policies to determine how long to run an experiment and how many times to repeat an experiment to attain target levels of confidence and accuracy in experimental results at low cost. We use the policies to benchmark storage servers by systematically mapping a storage server's saturation throughput across a range of server workloads and configurations.
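
As an illustration of the kind of repetition policy described in contribution (3), the sketch below repeats an experiment until an approximate 95% confidence interval around the mean is within a target fraction of that mean. This is a generic sequential stopping rule, not the dissertation's actual policy. The function name, parameters, and default thresholds are assumptions, and the normal quantile 1.96 is used for simplicity where a Student-t quantile would be more precise for small samples.

```python
import math
import random
import statistics


def repeat_until_confident(run_experiment, rel_halfwidth=0.05, z=1.96,
                           min_runs=3, max_runs=30):
    """Repeat `run_experiment` until the confidence interval is tight enough.

    Stops once the half-width of the approximate confidence interval
    (z * stdev / sqrt(n)) is at most `rel_halfwidth` times the absolute
    mean, or once `max_runs` repetitions have been made.
    Returns (mean, number_of_runs).
    """
    samples = []
    while len(samples) < max_runs:
        samples.append(run_experiment())
        if len(samples) >= min_runs:
            mean = statistics.mean(samples)
            half = z * statistics.stdev(samples) / math.sqrt(len(samples))
            if mean != 0 and half <= rel_halfwidth * abs(mean):
                break
    return statistics.mean(samples), len(samples)


# Hypothetical noisy throughput measurement (operations/s), for illustration.
rng = random.Random(0)
mean, runs = repeat_until_confident(lambda: rng.gauss(1000.0, 50.0))
print(f"estimated throughput {mean:.1f} after {runs} runs")
```

The `max_runs` cap bounds the cost of a noisy experiment, reflecting the trade-off between accuracy and cost that the abstract describes.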

Our empirical evaluation with real and synthetic applications on physical as well as virtual hardware resources shows that our experiment-driven framework can learn an effective knowledge base by conducting only 1-5% of the total number of possible experiments.

 
Advisers: Jeffrey S. Chase; Shivnath Babu
School: DUKE UNIVERSITY
Source: DAI/B 68-11, p. , Feb 2008
Source Type: Dissertation
Subjects: Computer science
Publication Number: 3284097
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3284097