title

software

abstract

section

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_author

author

date

address

container-title

volume

genre

issued

pdf

extras

Exploration in Reward Machines with Low Regret

https://github.com/HippolyteBourel/UCRL-RM

We study reinforcement learning (RL) for decision processes with non-Markovian reward, in which high-level knowledge in the form of reward machines is available to the learner. Specifically, we investigate the efficiency of RL under the average-reward criterion, in the regret minimization setting. We propose two model-based RL algorithms that each exploits the structure of the reward machines, and show that our algorithms achieve regret bounds that improve over those of baselines by a multiplicative factor proportional to the number of states in the underlying reward machine. To the best of our knowledge, the proposed algorithms and associated regret bounds are the first to tailor the analysis specifically to reward machines, either in the episodic or average-reward settings. We also present a regret lower bound for the studied setting, which indicates that the proposed algorithms achieve a near-optimal regret. Finally, we report numerical experiments that demonstrate the superiority of the proposed algorithms over existing baselines in practice.

Regular Papers

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

bourel23a

0

Exploration in Reward Machines with Low Regret

4114

4146

4114-4146

4114

false

Bourel, Hippolyte and Jonsson, Anders and Maillard, Odalric-Ambrym and Talebi, Mohammad Sadegh

given	family
Hippolyte	Bourel

given	family
Anders	Jonsson

given	family
Odalric-Ambrym	Maillard

given	family
Mohammad Sadegh	Talebi

2023-04-11

Proceedings of The 26th International Conference on Artificial Intelligence and Statistics

206

inproceedings

date-parts

2023

4

11

https://proceedings.mlr.press/v206/bourel23a/bourel23a.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2023-04-11-bourel23a.md

2023-04-11-bourel23a.md

Files

2023-04-11-bourel23a.md

Latest commit

History

2023-04-11-bourel23a.md

File metadata and controls