Distributed algorithms based on fictitious play for near optimal sequential decision making
by Sisikoglu, Esra, Ph.D., UNIVERSITY OF MICHIGAN, 2009, 84 pages; 3392845

Abstract:

We develop stochastic search algorithms to find optimal or close to optimal solutions for sequential decision making problems. We specifically consider two problem classes:

1. Large-scale, discrete, deterministic, finite horizon dynamic programming problems: We use a Sampled Fictitious Play (SFP) algorithm for solving large-scale, finite horizon, discrete dynamic programming (DP) problems. We model the DP problem as an identical interest game between multiple players. We show that the SFP algorithm converges to the equilibrium strategies of tins game. In addition, we present two new algorithms, namely Repeated SFP and SFP Based Local Search, that find globally optimal solutions using SFP as a base algorithm. We present the performance of the algorithms on dynamic lot sizing problems and the Traveling Salesman Problem (TSP). Numerical experiments show that our algorithms find close to optimal solutions very quickly. We also present small modifications that improve the performance of the algorithms.

2. Stochastic, discounted, infinite horizon Markov Decision Problems: Using Sampled Fictitious Play (SFP) concepts, we develop an online learning algorithm, referred to as SFP based Learning (SFPL), for solving a discounted homogeneous Markov Decision Problem (MDP) where the transition probabilities are unknown. In SFPL, we estimate and update the unknown transition probabilities, the optimal value, and the optimal action of each state, simultaneously. We prove the convergence of SFPL to the optimal solution. We compare the performance of SFPL with SARSA and Q-Learning on dynamic location and windy gridworld problems.

 
AdvisersMarina A. Epelman; Robert L. Smith
SchoolUNIVERSITY OF MICHIGAN
SourceDAI/B 71-01, p. , Apr 2010
Source TypeDissertation
SubjectsIndustrial engineering; Operations research
Publication Number3392845
Adobe PDF Access the complete dissertation:
 

» Find an electronic copy at your library.
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3392845
  If your library subscribes to the ProQuest Dissertations & Theses (PQDT) database, you may be entitled to a free electronic version of this graduate work. If not, you will have the option to purchase one, and access a 24 page preview for free (if available).

About ProQuest Dissertations & Theses
With over 2.3 million records, the ProQuest Dissertations & Theses (PQDT) database is the most comprehensive collection of dissertations and theses in the world. It is the database of record for graduate research.

The database includes citations of graduate works ranging from the first U.S. dissertation, accepted in 1861, to those accepted as recently as last semester. Of the 2.3 million graduate works included in the database, ProQuest offers more than 1.9 million in full text formats. Of those, over 860,000 are available in PDF format. More than 60,000 dissertations and theses are added to the database each year.

If you have questions, please feel free to visit the ProQuest Web site - http://www.proquest.com - or call ProQuest Hotline Customer Support at 1-800-521-3042.