Representation-Driven Reinforcement Learning

Machine Learning 2026-01-23 v3 Artificial Intelligence

Abstract

We present a representation-driven framework for reinforcement learning. By representing policies as estimates of their expected values, we leverage techniques from contextual bandits to guide exploration and exploitation. In particular, embedding a policy network into a linear feature space allows us to reframe the exploration-exploitation problem as a representation-exploitation problem, where good policy representations enable optimal exploration. We demonstrate the effectiveness of this framework by applying it to evolutionary and policy gradient-based approaches, where it significantly improves performance over traditional methods. Our framework provides a new perspective on reinforcement learning, highlighting the importance of policy representation in determining optimal exploration-exploitation strategies.
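
To make the bandit view concrete, here is a minimal sketch of how a linear bandit could choose among candidate policies once each policy has an embedding. It is an illustration under stated assumptions, not the authors' algorithm: it uses a standard LinUCB-style rule, assumes the embeddings are already given as fixed vectors, and assumes a policy's expected return is approximately linear in its embedding. The names `select_policy_ucb` and `update_statistics` are hypothetical.

```python
import numpy as np

def select_policy_ucb(embeddings, A, b, alpha=1.0):
    """Return the index of the candidate with the highest upper confidence bound.

    embeddings: (n_candidates, d) array, one row per candidate policy (assumed given).
    A, b: ridge-regression statistics accumulated from past (embedding, return) pairs.
    alpha: exploration weight (hypothetical tuning parameter).
    """
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b                      # estimated linear value weights
    means = embeddings @ theta             # predicted return per candidate
    # Uncertainty term sqrt(z^T A^{-1} z), computed for every row z.
    widths = np.sqrt(np.einsum("id,dk,ik->i", embeddings, A_inv, embeddings))
    return int(np.argmax(means + alpha * widths))

def update_statistics(A, b, z, observed_return):
    """Fold one observed (embedding, return) pair into the statistics."""
    return A + np.outer(z, z), b + observed_return * z

# Toy usage: two candidates with orthogonal embeddings.
d = 2
A, b = np.eye(d), np.zeros(d)
candidates = np.eye(d)
first = select_policy_ucb(candidates, A, b, alpha=10.0)
A, b = update_statistics(A, b, candidates[first], observed_return=2.0)
second = select_policy_ucb(candidates, A, b, alpha=10.0)
```

In this toy run the second choice moves to the candidate that has not been tried yet, because its uncertainty term is larger. That is the sense in which the quality of the representation governs exploration: directions of the embedding space that have been observed less receive a larger bonus.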

Cite

@article{arxiv.2305.19922,
  title  = {Representation-Driven Reinforcement Learning},
  author = {Ofir Nabati and Guy Tennenholtz and Shie Mannor},
  journal= {arXiv preprint arXiv:2305.19922},
  year   = {2023}
}

Comments

Accepted to ICML 2023
