2023-04-11-bhatt23b.md

title: 'Piecewise Stationary Bandits under Risk Criteria'
abstract: 'Piecewise stationary stochastic multi-armed bandits have been extensively explored in the risk-neutral and sub-Gaussian setting. In this work, we consider a multi-armed bandit framework in which the reward distributions are heavy-tailed and non-stationary, and evaluate the performance of algorithms using general risk criteria. Specifically, we make the following contributions: (i) We first propose a non-parametric change detection algorithm that can detect general distributional changes in heavy-tailed distributions. (ii) We then propose a truncation-based UCB-type bandit algorithm integrating the above regime change detection algorithm to minimize the regret of the non-stationary learning problem. (iii) Finally, we establish the regret bounds for the proposed bandit algorithm by characterizing the statistical properties of the general change detection algorithm, along with a novel regret analysis.'
section: Regular Papers
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: bhatt23b
month: 0
tex_title: 'Piecewise Stationary Bandits under Risk Criteria'
firstpage: 4313
lastpage: 4335
page: 4313-4335
order: 4313
cycles: false
bibtex_author: Bhatt, Sujay and Fang, Guanhua and Li, Ping
author:
- given: Sujay
  family: Bhatt
- given: Guanhua
  family: Fang
- given: Ping
  family: Li
date: 2023-04-11
address:
container-title: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
volume: 206
genre: inproceedings
issued:
  date-parts:
  - 2023
  - 4
  - 11
pdf:
extras: