On effectiveness of the Mirror Descent Algorithm for a stochastic multi-armed bandit governed by a stationary finite Markov chain

2013 3rd Australian Control Conference (AUCC)

Author(s): Alexander Nazin ; Boris Miller
Publisher: Engineers Australia
Publication Date: 1 November 2013
Conference Location: Fremantle, WA, Australia
Conference Date: 4 November 2013
Page(s): 244 - 250
ISBN (CD): 978-1-4799-2497-4
ISBN (Electronic): 978-1-4799-2498-1
DOI: 10.1109/AUCC.2013.6697280

In this article, we study the effectiveness of the Mirror Descent Randomized Control Algorithm, recently developed for a class of homogeneous finite Markov chains governed by the stochastic...