Sample-efficient Cross-Entropy Method for Real-time Planning

Machine Learning 2020-08-17 v1 Robotics Machine Learning

Abstract

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory. It requires 2.7-22x fewer samples and yields a performance increase of 1.2-10x in high-dimensional control problems.
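To make the two additions named in the abstract more concrete, the following is a minimal sketch of a CEM planner with temporally-correlated action noise and a simple elite memory. It is not the authors' implementation. The AR(1) noise process, the function names (`correlated_noise`, `cem_plan`), the action bounds of [-1, 1], and all default hyperparameters are assumptions chosen for illustration, and the abstract does not specify how the paper realizes correlation or memory.

```python
import numpy as np


def correlated_noise(rng, n_samples, horizon, action_dim, rho=0.8):
    """AR(1) noise along the time axis with unit marginal variance.

    Assumption: AR(1) is only a stand-in for "temporally-correlated actions";
    rho=0 recovers the independent noise of plain CEM.
    """
    eps = rng.standard_normal((n_samples, horizon, action_dim))
    noise = np.empty_like(eps)
    noise[:, 0] = eps[:, 0]
    scale = np.sqrt(1.0 - rho ** 2)  # keeps the variance at 1 for every step
    for t in range(1, horizon):
        noise[:, t] = rho * noise[:, t - 1] + scale * eps[:, t]
    return noise


def cem_plan(cost_fn, horizon, action_dim, n_samples=64, n_elites=8,
             n_iters=5, keep_frac=0.5, rho=0.8, seed=0):
    """Return (best action sequence, best cost per iteration).

    cost_fn maps an array of shape (horizon, action_dim) to a scalar cost and
    is assumed to be deterministic (e.g. a rollout in a learned model).
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    elites = np.empty((0, horizon, action_dim))
    history = []
    for _ in range(n_iters):
        noise = correlated_noise(rng, n_samples, horizon, action_dim, rho)
        samples = np.clip(mean + std * noise, -1.0, 1.0)
        # Memory: carry the best previous elites into the new candidate set.
        n_keep = int(keep_frac * len(elites))
        candidates = np.concatenate([samples, elites[:n_keep]], axis=0)
        costs = np.array([cost_fn(a) for a in candidates])
        order = np.argsort(costs)[:n_elites]
        elites = candidates[order]  # sorted from lowest to highest cost
        # Refit the sampling distribution to the elite set.
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        history.append(float(costs[order[0]]))
    return elites[0], history


if __name__ == "__main__":
    # Toy cost: squared distance of every action to 0.5 (hypothetical task).
    toy_cost = lambda a: float(np.sum((a - 0.5) ** 2))
    plan, history = cem_plan(toy_cost, horizon=10, action_dim=2)
    print(plan.shape, history)
```

Because the best previous elite is always re-entered into the candidate set, the best cost in this sketch cannot get worse from one iteration to the next, which is one simple way memory can reduce wasted samples. In a receding-horizon controller, only the first action of the returned plan would typically be executed before replanning.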

Cite

@article{arxiv.2008.06389,
  title  = {Sample-efficient Cross-Entropy Method for Real-time Planning},
  author = {Cristina Pinneri and Shambhuraj Sawant and Sebastian Blaes and Jan Achterhold and Joerg Stueckler and Michal Rolinek and Georg Martius},
  journal= {arXiv preprint arXiv:2008.06389},
  year   = {2020}
}