On nearly selfoptimizing strategies for multiarmed bandit problems with controlled arms

Volume 23 / 1996

Ewa Drabik, Applicationes Mathematicae 23 (1996), 449-473. DOI: 10.4064/am-23-4-449-473

Abstract

Two kinds of strategies for a multiarmed Markov bandit problem with controlled arms are considered: a strategy with forcing and a strategy with randomization. The choice of arm and control function in both cases is based on the current value of the average cost per unit time functional. Some simulation results are also presented.

Authors

  • Ewa Drabik
