Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

Machine Learning 2024-06-26 v1 Artificial Intelligence Machine Learning

Abstract

Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy that optimizes worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly been model-based, and few model-free methods offer convergence guarantees or sample complexity bounds. This paper proposes a model-free DR-RL algorithm that leverages the Multi-level Monte Carlo (MLMC) technique to close this gap. Our approach integrates a threshold mechanism that ensures the algorithm requires only a finite number of samples to implement, a significant improvement over previous model-free algorithms. We develop algorithms for uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses for all three cases. Our algorithms are the first model-free DR-RL approach with finite sample complexity for total variation and Chi-square divergence uncertainty sets, and they offer improved sample complexity and broader applicability compared to existing model-free DR-RL algorithms for the KL divergence model. The complexities of our method are the tightest known results for all three uncertainty models in model-free DR-RL, underscoring the effectiveness and efficiency of our algorithm and highlighting its potential for practical applications.
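
To illustrate the idea of an MLMC estimator with a threshold, the following is a minimal generic sketch in Python, not the algorithm from the paper. It estimates a nonlinear function f of an expectation E[X] by drawing a random level N, capped at a hypothetical threshold `n_max`, and using 2^(N+1) samples. The names `truncated_mlmc`, `sample`, `f`, and `n_max`, and the choice of a truncated geometric level distribution, are assumptions made for this example.

```python
import numpy as np

def truncated_mlmc(sample, f, n_max, rng):
    """Generic truncated MLMC estimate of f(E[X]) (illustrative sketch only).

    sample(k, rng) must return a 1-D array of k i.i.d. draws of X.
    Returns the estimate and the number of samples used.
    """
    # Assumed level distribution: P(N = n) proportional to 2^-(n+1), n = 0..n_max.
    levels = np.arange(n_max + 1)
    probs = 0.5 ** (levels + 1)
    probs /= probs.sum()
    n = rng.choice(levels, p=probs)

    # The cap on N bounds the sample count by 2^(n_max + 1).
    xs = sample(2 ** (n + 1), rng)

    # Multilevel correction: full-batch plug-in value minus the average of
    # the plug-in values on the even-indexed and odd-indexed halves.
    delta = f(xs.mean()) - 0.5 * (f(xs[0::2].mean()) + f(xs[1::2].mean()))

    # Single-sample base term plus the importance-weighted correction.
    estimate = f(xs[0]) + delta / probs[n]
    return estimate, xs.size

# Example: estimate (E[X])^2 for X ~ Normal(1, 1) by averaging many runs.
rng = np.random.default_rng(0)
draw = lambda k, r: r.normal(1.0, 1.0, size=k)
runs = [truncated_mlmc(draw, lambda m: m ** 2, 5, rng) for _ in range(2000)]
mean_estimate = np.mean([est for est, _ in runs])
mean_samples = np.mean([used for _, used in runs])
```

In this sketch the corrections telescope, so the estimator has the same expectation as the plug-in estimate f(sample mean) computed from 2^(n_max+1) samples, while each call never draws more than 2^(n_max+1) samples and the average number of draws grows roughly linearly in `n_max`. Without the cap, the level N is unbounded and no fixed sample budget suffices in the worst case.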

Cite

@article{arxiv.2406.17096,
  title  = {Model-Free Robust Reinforcement Learning with Sample Complexity Analysis},
  author = {Yudan Wang and Shaofeng Zou and Yue Wang},
  journal= {arXiv preprint arXiv:2406.17096},
  year   = {2024}
}

Comments

UAI 2024