Related papers: Model-Predictive Control via Cross-Entropy and Gra…

CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

Current state-of-the-art model-based reinforcement learning algorithms use trajectory sampling methods, such as the Cross-Entropy Method (CEM), for planning in continuous control settings. These zeroth-order optimizers require sampling a…

Machine Learning · Computer Science 2021-12-16 Kevin Huang, Sahin Lale, Ugo Rosolia, Yuanyuan Shi, Anima Anandkumar

Sample-efficient Cross-Entropy Method for Real-time Planning

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency…

Machine Learning · Computer Science 2020-08-17 Cristina Pinneri, Shambhuraj Sawant, Sebastian Blaes, Jan Achterhold, Joerg Stueckler, Michal Rolinek, Georg Martius

Sample-Efficient and Smooth Cross-Entropy Method Model Predictive Control Using Deterministic Samples

Cross-entropy method model predictive control (CEM-MPC) is a powerful gradient-free technique for nonlinear optimal control, but its performance is often limited by the reliance on random sampling. This conventional approach can lead to…

Systems and Control · Electrical Eng. & Systems 2026-05-12 Markus Walker, Daniel Frisch, Uwe D. Hanebeck

GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction

In this work we present a new method of black-box optimization and constraint satisfaction. Existing algorithms that have attempted to solve this problem are unable to consider multiple modes, and are not able to adapt to changes in…

Machine Learning · Computer Science 2020-02-19 Kourosh Hakhamaneshi, Keertana Settaluri, Pieter Abbeel, Vladimir Stojanovic

The Differentiable Cross-Entropy Method

We study the cross-entropy method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant that enables us to differentiate the output of CEM with respect to the…

Machine Learning · Computer Science 2020-08-18 Brandon Amos, Denis Yarats

A Simple Decentralized Cross-Entropy Method

Cross-Entropy Method (CEM) is commonly used for planning in model-based reinforcement learning (MBRL) where a centralized approach is typically utilized to update the sampling distribution based on only the top-$k$ operation's results on…

Machine Learning · Computer Science 2022-12-19 Zichen Zhang, Jun Jin, Martin Jagersand, Jun Luo, Dale Schuurmans

Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning

Model-based reinforcement learning (MBRL) with real-time planning has shown great potential in locomotion and manipulation control tasks. However, the existing planning methods, such as the Cross-Entropy Method (CEM), do not scale well to…

Machine Learning · Computer Science 2023-09-12 Mostafa Kotb, Cornelius Weber, Stefan Wermter

Combining Sampling- and Gradient-based Planning for Contact-rich Manipulation

Planning over discontinuous dynamics is needed for robotics tasks like contact-rich manipulation, which presents challenges in the numerical stability and speed of planning methods when either neural network or analytical models are used.…

Robotics · Computer Science 2024-03-26 Filippo Rozzi, Loris Roveda, Kevin Haninger

Safe Learning-based Gradient-free Model Predictive Control Based on Cross-entropy Method

In this paper, a safe and learning-based control framework for model predictive control (MPC) is proposed to optimize nonlinear systems with a non-differentiable objective function under uncertain environmental disturbances. The control…

Robotics · Computer Science 2022-02-22 Lei Zheng, Rui Yang, Zhixuan Wu, Jiesen Pan, Hui Cheng

Closing the Train-Test Gap in World Models for Gradient-Based Planning

World models paired with model predictive control (MPC) can be trained offline on large-scale datasets of expert trajectories and enable generalization to a wide range of planning tasks at inference time. Compared to traditional MPC…

Machine Learning · Computer Science 2025-12-11 Arjun Parthasarathy, Nimit Kalra, Rohun Agrawal, Yann LeCun, Oumayma Bounou, Pavel Izmailov, Micah Goldblum

Extremum-Seeking Action Selection for Accelerating Policy Optimization

Reinforcement learning for control over continuous spaces typically uses high-entropy stochastic policies, such as Gaussian distributions, for local exploration and estimating policy gradient to optimize performance. Many robotic control…

Machine Learning · Computer Science 2024-04-03 Ya-Chien Chang, Sicun Gao

Cross-Entropy Method Variants for Optimization

The cross-entropy (CE) method is a popular stochastic method for optimization due to its simplicity and effectiveness. Designed for rare-event simulations where the probability of a target event occurring is relatively small, the CE-method…

Machine Learning · Computer Science 2020-09-22 Robert J. Moss

Cross-Entropy Optimization for Hyperparameter Optimization in Stochastic Gradient-based Approaches to Train Deep Neural Networks

In this paper, we present a cross-entropy optimization method for hyperparameter optimization in stochastic gradient-based approaches to train deep neural networks. The value of a hyperparameter of a learning algorithm often has great…

Machine Learning · Computer Science 2024-09-17 Kevin Li, Fulu Li

Model predictive control with random batch methods for a guiding problem

We model, simulate and control the guiding problem for a herd of evaders under the action of repulsive drivers. The problem is formulated in an optimal control framework, where the drivers (controls) aim to guide the evaders (states) to a…

Optimization and Control · Mathematics 2020-05-01 Dongnam Ko, Enrique Zuazua

Learning Probabilistic Multi-Modal Actor Models for Vision-Based Robotic Grasping

Many previous works approach vision-based robotic grasping by training a value network that evaluates grasp proposals. These approaches require an optimization process at run-time to infer the best action from the value network. As a…

Robotics · Computer Science 2019-04-17 Mengyuan Yan, Adrian Li, Mrinal Kalakrishnan, Peter Pastor

Local Entropy Search over Descent Sequences for Bayesian Optimization

Searching large and complex design spaces for a global optimum can be infeasible and unnecessary. A practical alternative is to iteratively refine the neighborhood of an initial design using local optimization methods such as gradient…

Machine Learning · Computer Science 2025-11-25 David Stenger, Armin Lindicke, Alexander von Rohr, Sebastian Trimpe

Dream-MPC: Gradient-Based Model Predictive Control with Latent Imagination

State-of-the-art model-based Reinforcement Learning (RL) approaches either use gradient-free, population-based methods for planning, learned policy networks, or a combination of policy networks and planning. Hybrid approaches that combine…

Machine Learning · Computer Science 2026-05-25 Jonathan Spieler, Sven Behnke

Model predictive control-based value estimation for efficient reinforcement learning

Reinforcement learning is limited in practice primarily by the number of interactions it requires with virtual environments. This makes for a challenging problem because it is implausible to obtain a local optimal…

Machine Learning · Computer Science 2024-10-28 Qizhen Wu, Kexin Liu, Lei Chen

Bregman Centroid Guided Cross-Entropy Method

The Cross-Entropy Method (CEM) is a widely adopted trajectory optimizer in model-based reinforcement learning (MBRL), but its unimodal sampling strategy often leads to premature convergence in multimodal landscapes. In this work, we propose…

Machine Learning · Computer Science 2025-07-02 Yuliang Gu, Hongpeng Cao, Marco Caccamo, Naira Hovakimyan

Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent

We consider the joint design and control of discrete-time stochastic dynamical systems over a finite time horizon. We formulate the problem as a multi-step optimization problem under uncertainty seeking to identify a system design and a…

Machine Learning · Computer Science 2022-01-07 Adrien Bolland, Ioannis Boukas, Mathias Berger, Damien Ernst