English
Related papers

Related papers: A Cross Entropy based Stochastic Approximation Alg…

200 papers

In this paper, we provide two new stable online algorithms for the problem of prediction in reinforcement learning, \emph{i.e.}, estimating the value function of a model-free Markov reward process using the linear function approximation…

Machine Learning · Computer Science 2018-06-19 Ajin George Joseph , Shalabh Bhatnagar

The cross entropy (CE) method is a model based search method to solve optimization problems where the objective function has minimal structure. The Monte-Carlo version of the CE method employs the naive sample averaging technique which is…

Artificial Intelligence · Computer Science 2018-02-01 Ajin George Joseph , Shalabh Bhatnagar

Reinforcement learning (RL) problems are fundamental in online decision-making and have been instrumental in finding an optimal policy for Markov decision processes (MDPs). Function approximations are usually deployed to handle large or…

Machine Learning · Computer Science 2025-05-20 Jiashuo Jiang , Yiming Zong , Yinyu Ye

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. Contrary to traditional RL algorithms…

Machine Learning · Computer Science 2021-08-24 Arghyadip Roy , Vivek Borkar , Abhay Karandikar , Prasanna Chaporkar

This paper deals with the estimation of rare event probabilities using importance sampling (IS), where an optimal proposal distribution is computed with the cross-entropy (CE) method. Although, IS optimized with the CE method leads to an…

Computation · Statistics 2020-02-05 Patrick Héas

We present an iterative inverse reinforcement learning algorithm to infer optimal cost functions in continuous spaces. Based on a popular maximum entropy criteria, our approach iteratively finds a weight improvement step and proposes a…

Machine Learning · Computer Science 2025-05-14 Sarmad Mehrdad , Avadesh Meduri , Ludovic Righetti

In this paper, we present a cross-entropy optimization method for hyperparameter optimization in stochastic gradient-based approaches to train deep neural networks. The value of a hyperparameter of a learning algorithm often has great…

Machine Learning · Computer Science 2024-09-17 Kevin Li , Fulu Li

We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are…

Machine Learning · Computer Science 2022-10-17 Anna Winnicki , R. Srikant

The cross-entropy (CE) method is a popular stochastic method for optimization due to its simplicity and effectiveness. Designed for rare-event simulations where the probability of a target event occurring is relatively small, the CE-method…

Machine Learning · Computer Science 2020-09-22 Robert J. Moss

Designing sample-efficient and computationally feasible reinforcement learning (RL) algorithms is particularly challenging in environments with large or infinite state and action spaces. In this paper, we advance this effort by presenting…

Machine Learning · Computer Science 2024-10-04 Zakaria Mhammedi

Parameter estimation in Markov random fields (MRFs) is a difficult task, in which inference over the network is run in the inner loop of a gradient descent procedure. Replacing exact inference with approximate methods such as loopy belief…

Machine Learning · Computer Science 2012-06-18 Varun Ganapathi , David Vickrey , John Duchi , Daphne Koller

We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from expert's demonstrated trajectories. We propose a new way to improve the learning process by adding a…

Machine Learning · Computer Science 2022-08-23 The Viet Bui , Tien Mai , Patrick Jaillet

We study model-based reinforcement learning (RL) for episodic Markov decision processes (MDP) whose transition probability is parametrized by an unknown transition core with features of state and action. Despite much recent progress in…

Machine Learning · Statistics 2024-11-19 Taehyun Hwang , Min-hwan Oh

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency…

In deep Reinforcement Learning (RL), value functions are typically approximated using deep neural networks and trained via mean squared error regression objectives to fit the true value functions. Recent research has proposed an alternative…

Machine Learning · Computer Science 2024-11-19 Denis Tarasov , Kirill Brilliantov , Dmitrii Kharlapenko

This paper studies the continuous-time reinforcement learning (RL) for optimal switching problems across multiple regimes. We consider a type of exploratory formulation under entropy regularization where the agent randomizes both the timing…

Optimization and Control · Mathematics 2025-12-23 Yijie Huang , Mengge Li , Xiang Yu , Zhou Zhou

To overcome the curse of dimensionality and curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider…

Machine Learning · Computer Science 2018-11-29 Arghyadip Roy , Vivek Borkar , Abhay Karandikar , Prasanna Chaporkar

Large language models (LLMs) have shown promise in performing complex multi-step reasoning, yet they continue to struggle with mathematical reasoning, often making systematic errors. A promising solution is reinforcement learning (RL)…

Machine Learning · Computer Science 2025-09-22 Hanning Zhang , Pengcheng Wang , Shizhe Diao , Yong Lin , Rui Pan , Hanze Dong , Dylan Zhang , Pavlo Molchanov , Tong Zhang

Model-free deep-reinforcement-based learning algorithms have been applied to a range of COPs~\cite{bello2016neural}~\cite{kool2018attention}~\cite{nazari2018reinforcement}. However, these approaches suffer from two key challenges when…

Machine Learning · Computer Science 2022-06-01 Nasrin Sultana , Jeffrey Chan , Tabinda Sarwar , A. K. Qin

A basic simulation-based reinforcement learning algorithm is the Monte Carlo Exploring States (MCES) method, also known as optimistic policy iteration, in which the value function is approximated by simulated returns and a greedy policy is…

Optimization and Control · Mathematics 2020-07-22 Jun Liu
‹ Prev 1 2 3 10 Next ›