Conference poster

Tabular and Deep Learning of Whittle Index

Abstract : The Whittle index policy is an asymptotically optimal heuristic for solving Restless Multi-Armed Bandit Problems (RMABP). We propose two algorithms, QWI and QWINN, for computing these indices. Both use a two-timescale scheme to compute the indices and the Q-values of each state-action pair.
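
The two-timescale idea in the abstract can be illustrated with a small tabular sketch. This is not the poster's code: it assumes a single discounted arm, a subsidy `lam[x]` paid for the passive action, uniform random exploration, and step-size schedules chosen only for illustration. The function name `qwi_sketch` and the arrays `P`, `R`, `Q`, `lam` are hypothetical. Q-values are updated with the larger (fast) step size, and the index estimate of each reference state `x` is nudged with the smaller (slow) step size toward the point where active and passive are equally valuable in `x`.

```python
import numpy as np


def qwi_sketch(P, R, gamma=0.9, n_steps=20000, seed=0):
    """Two-timescale tabular sketch for one arm (illustrative assumptions only).

    P[a, s, s2]: transition probabilities under action a (0 = passive, 1 = active).
    R[s, a]: reward. Returns (Q, lam), where lam[x] is the index estimate of state x.
    """
    rng = np.random.default_rng(seed)
    n_states = R.shape[0]
    ref = np.arange(n_states)
    # Q[x, s, a]: one Q-table per reference state x, learned under subsidy lam[x].
    Q = np.zeros((n_states, n_states, 2))
    lam = np.zeros(n_states)
    s = 0
    for n in range(1, n_steps + 1):
        alpha = 1.0 / (1.0 + n) ** 0.6  # fast step size (assumed schedule)
        beta = 1.0 / (1.0 + n)          # slow step size, beta / alpha -> 0
        a = int(rng.integers(2))        # uniform exploration (assumption)
        s_next = int(rng.choice(n_states, p=P[a, s]))
        # Fast timescale: Q-learning step for every reference state at once;
        # the passive action (a = 0) additionally earns the subsidy lam[x].
        target = R[s, a] + (1 - a) * lam + gamma * Q[:, s_next, :].max(axis=1)
        Q[:, s, a] += alpha * (target - Q[:, s, a])
        # Slow timescale: move lam[x] toward indifference between the two
        # actions in the reference state x.
        lam += beta * (Q[ref, ref, 1] - Q[ref, ref, 0])
        s = s_next
    return Q, lam


if __name__ == "__main__":
    # Toy arm whose transitions do not depend on the action (made-up numbers).
    P = np.full((2, 2, 2), 0.5)
    R = np.array([[0.0, 1.0], [0.0, 2.0]])
    Q, lam = qwi_sketch(P, R)
    print(lam)
```

In the toy arm above the transitions are the same under both actions, so one would expect each estimate to drift toward `R[x, 1] - R[x, 0]`; how closely it gets there depends on the seed, the step sizes, and the number of steps. A deep variant in the spirit of QWINN would presumably replace the table `Q` with a neural network, which this sketch does not attempt.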

https://hal-univ-pau.archives-ouvertes.fr/hal-03810695
Contributor : Francisco Robledo
Submitted on : Tuesday, October 11, 2022 - 3:40:00 PM
Last modification on : Wednesday, November 9, 2022 - 9:58:05 AM

File

poster.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03810695, version 1

Citation

Francisco Robledo, Urtzi Ayesta, Konstantin Avrachenkov, Vivek S Borkar. Tabular and Deep Learning of Whittle Index. EWRL 2022 - 15th European Workshop on Reinforcement Learning, Sep 2022, Milan, Italy. ⟨hal-03810695⟩
