Introduction
1 Review of Continuous Time Models
1.1 Martingales and Martingale Inequalities
1.2 Stochastic Integration
1.3 Stochastic Differential Equations: Diffusions
1.4 Reflected Diffusions
1.5 Processes with Jumps
2 Controlled Markov Chains
2.1 Recursive Equations for the Cost
2.2 Optimal Stopping Problems
2.3 Discounted Cost
2.4 Control to a Target Set and Contraction Mappings
2.5 Finite Time Control Problems
3 Dynamic Programming Equations
3.1 Functionals of Uncontrolled Processes
3.2 The Optimal Stopping Problem
3.3 Control Until a Target Set Is Reached
3.4 A Discounted Problem with a Target Set and Reflection
3.5 Average Cost Per Unit Time
4 Markov Chain Approximation Method: Introduction
4.1 Markov Chain Approximation
4.2 Continuous Time Interpolation
4.3 A Markov Chain Interpolation
4.4 A Random Walk Approximation
4.5 A Deterministic Discounted Problem
4.6 Deterministic Relaxed Controls
5 Construction of the Approximating Markov Chains
5.1 One Dimensional Examples
5.2 Numerical Simplifications
5.3 The General Finite Difference Method
5.4 A Direct Construction
5.5 Variable Grids
5.6 Jump Diffusion Processes
5.7 Reflecting Boundaries
5.8 Dynamic Programming Equations
5.9 Controlled and State Dependent Variance
6 Computational Methods for Controlled Markov Chains
6.1 The Problem Formulation
6.2 Classical Iterative Methods
6.3 Error Bounds
6.4 Accelerated Jacobi and Gauss-Seidel Methods
6.5 Domain Decomposition
6.6 Coarse Grid-Fine Grid Solutions
6.7 A Multigrid Method
6.8 Linear Programming
7 The Ergodic Cost Problem: Formulation and Algorithms
7.1 Formulation of the Control Problem
7.2 A Jacobi Type Iteration
7.3 Approximation in Policy Space
7.4 Numerical Methods
7.5 The Control Problem
7.6 The Interpolated Process
7.7 Computations
7.8 Boundary Costs and Controls
8 Heavy Traffic and Singular Control
8.1 Motivating Examples
The basic stochastic approximation algorithms introduced by Robbins and Monro and by Kiefer and Wolfowitz in the early 1950s have been the subject of an enormous literature, both theoretical and applied. This is due to the large number of applications and the interesting theoretical issues in the analysis of "dynamically defined" stochastic processes. The basic paradigm is a stochastic difference equation such as θ_{n+1} = θ_n + ε_n Y_n, where θ_n takes its values in some Euclidean space, Y_n is a random variable, and the "step size" ε_n > 0 is small and might go to zero as n → ∞. In its simplest form, θ is a parameter of a system, and the random vector Y_n is a function of "noise-corrupted" observations taken on the system when the parameter is set to θ_n. One recursively adjusts the parameter so that some goal is met asymptotically. This book is concerned with the qualitative and asymptotic properties of such recursive algorithms in the diverse forms in which they arise in applications. There are analogous continuous time algorithms, but the conditions and proofs are generally very close to those for the discrete time case.

The original work was motivated by the problem of finding a root of a continuous function ḡ(θ), where the function is not known but the experimenter is able to take "noisy" measurements at any desired value of θ. Recursive methods for root finding are common in classical numerical analysis, and it is reasonable to expect that appropriate stochastic analogs would also perform well.
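As a minimal sketch of the root-finding paradigm described above, the following assumes a hypothetical noisy measurement function and the classical step size ε_n = 1/(n+1); the specific test function g(θ) = θ − 2 and noise level are illustrative choices, not taken from the text.

```python
import random

def robbins_monro(noisy_g, theta0, n_steps=5000, seed=0):
    """Iterate theta_{n+1} = theta_n + eps_n * Y_n with eps_n = 1/(n+1),
    where Y_n = -noisy_g(theta_n) is a noisy correction driving theta
    toward a root of the unknown mean function g."""
    rng = random.Random(seed)
    theta = theta0
    for n in range(n_steps):
        eps = 1.0 / (n + 1)        # step size decreasing to zero
        y = -noisy_g(theta, rng)   # noise-corrupted observation at theta_n
        theta = theta + eps * y
    return theta

# Hypothetical example: mean function g(theta) = theta - 2 observed with
# additive Gaussian noise, so the sought root is theta* = 2.
def noisy_g(theta, rng):
    return (theta - 2.0) + rng.gauss(0.0, 0.5)

theta_hat = robbins_monro(noisy_g, theta0=0.0)
```

With the decreasing step size, the iterate averages out the measurement noise and θ_n converges to the root even though no single observation of g is exact.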