Adaptive control of diffusion processes with a discounted reward criterion
This paper studies the optimal control problem of finding control policies that maximize a discounted reward criterion when the system state evolves according to a stochastic differential equation (SDE). Both the instantaneous reward function and the SDE’s drift coefficient may depend on an unknown parameter. We give conditions ensuring the existence of an asymptotically optimal policy obtained via the so-called Principle of Estimation and Control. We illustrate our results with several examples.
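For orientation, the setting described above can be sketched in symbols. The notation below is an assumption made for illustration and does not come from the text: $b$ and $\sigma$ denote the drift and diffusion coefficients, $r$ the instantaneous reward, $\alpha > 0$ the discount rate, $\theta$ the unknown parameter, $W$ a Wiener process, $\hat{\theta}_t$ an estimate of $\theta$ at time $t$, and $f^{*}(\cdot,\theta)$ a stationary policy that is optimal when the parameter value is $\theta$. Writing the diffusion coefficient as a function of the state alone is also an assumption.

```latex
% Controlled diffusion with parameter-dependent drift (illustrative notation)
\[
  dx(t) = b\bigl(x(t), u(t), \theta\bigr)\,dt + \sigma\bigl(x(t)\bigr)\,dW(t),
  \qquad x(0) = x .
\]
% Discounted reward criterion to be maximized over policies \pi
\[
  V(x, \pi, \theta)
  = \mathbb{E}^{\pi,\theta}_{x}\!\left[ \int_{0}^{\infty}
      e^{-\alpha t}\, r\bigl(x(t), u(t), \theta\bigr)\,dt \right].
\]
% Estimation-and-control idea: replace the unknown \theta by its current estimate
\[
  u(t) = f^{*}\bigl(x(t), \hat{\theta}_t\bigr).
\]
```

Under this reading, the adaptive policy acts at each time as if the current estimate were the true parameter value. Asymptotic optimality would then concern how the reward attained by such a policy compares with the optimal reward for the true $\theta$.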