Related papers: Efficient Reinforcement Learning in Deterministic …

200 papers

We design a new provably efficient algorithm for episodic reinforcement learning with generalized linear function approximation. We analyze the algorithm under a new expressivity assumption that we call "optimistic closure," which is…

Machine Learning · Statistics 2019-12-10 Yining Wang, Ruosong Wang, Simon S. Du, Akshay Krishnamurthy

We study the problem of reinforcement learning in infinite-horizon discounted linear Markov decision processes (MDPs), and propose the first computationally efficient algorithm achieving rate-optimal regret guarantees in this setting. Our…

Machine Learning · Computer Science 2026-03-16 Antoine Moulin, Gergely Neu, Luca Viano

This paper shows that the optimal policy and value functions of a Markov Decision Process (MDP), either discounted or not, can be captured by a finite-horizon undiscounted Optimal Control Problem (OCP), even if based on an inexact model.…

Systems and Control · Electrical Eng. & Systems 2023-02-08 Arash Bahari Kordabad, Mario Zanon, Sebastien Gros

Reinforcement Learning, a machine learning framework for training an autonomous agent based on rewards, has shown outstanding results in various domains. However, it is known that learning a good policy is difficult in a domain where…

Machine Learning · Computer Science 2019-06-27 Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka

Reinforcement learning with outcome-based feedback faces a fundamental challenge: when rewards are only observed at trajectory endpoints, how do we assign credit to the right actions? This paper provides the first comprehensive analysis of…

Machine Learning · Computer Science 2025-07-25 Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie

Exploring the most task-friendly camera setting -- optimal camera placement (OCP) problem -- in tasks that use multiple cameras is of great importance. However, few existing OCP solutions specialize in depth observation of indoor scenes,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-22 Yichuan Chen, Manabu Tsukada, Hiroshi Esaki

While policy-based reinforcement learning (RL) achieves tremendous successes in practice, it is significantly less understood in theory, especially compared with value-based RL. In particular, it remains elusive how to design a provably…

Machine Learning · Computer Science 2024-04-02 Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to…

Machine Learning · Statistics 2019-09-25 Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen

We consider online reinforcement learning in episodic Markov decision process (MDP) with unknown transition function and stochastic rewards drawn from some fixed but unknown distribution. The learner aims to learn the optimal policy and…

Machine Learning · Computer Science 2024-03-12 Vincent Leon, S. Rasoul Etesami

Reinforcement learning (RL) problems are fundamental in online decision-making and have been instrumental in finding an optimal policy for Markov decision processes (MDPs). Function approximations are usually deployed to handle large or…

Machine Learning · Computer Science 2025-05-20 Jiashuo Jiang, Yiming Zong, Yinyu Ye

We study offline Reinforcement Learning in large infinite-horizon discounted Markov Decision Processes (MDPs) when the reward and transition models are linearly realizable under a known feature map. Starting from the classic linear-program…

Machine Learning · Computer Science 2024-05-24 Gergely Neu, Nneka Okolo

Reinforcement Learning (RL) has achieved tremendous success in recent years. However, the classical foundations of RL do not account for the risk sensitivity of the objective function, which is critical in various fields, including…

Machine Learning · Computer Science 2025-11-14 Mohammad Alipour-Vaezi, Huaiyang Zhong, Kwok-Leung Tsui, Sajad Khodadadian

We develop a probabilistic framework for analysing model-based reinforcement learning in the episodic setting. We then apply it to study finite-time horizon stochastic control problems with linear dynamics but unknown coefficients and…

Machine Learning · Computer Science 2021-12-22 Lukasz Szpruch, Tanut Treetanthiploet, Yufei Zhang

Existing episodic reinforcement algorithms assume that the length of an episode is fixed across time and known a priori. In this paper, we consider a general framework of episodic reinforcement learning when the length of each episode is…

Machine Learning · Computer Science 2023-02-08 Debmalya Mandal, Goran Radanovic, Jiarui Gan, Adish Singla, Rupak Majumdar

We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite the efforts devoted, existing algorithms with theoretic finite-sample guarantees typically assume exploratory data coverage or…

Machine Learning · Computer Science 2023-05-25 Chenjie Mao

We present a general convergent class of reinforcement learning algorithms that is founded on two distinct principles: (1) mapping value estimates to a different space using arbitrary functions from a broad class, and (2) linearly…

Machine Learning · Computer Science 2022-03-18 Mehdi Fatemi, Arash Tavakoli

In this theoretical paper we are concerned with the problem of learning a value function by a smooth general function approximator, to solve a deterministic episodic control problem in a large continuous state space. It is shown that…

Machine Learning · Computer Science 2011-01-04 Michael Fairbank, Eduardo Alonso

We consider the problem of learning the optimal policy for Markov decision processes with safety constraints. We formulate the problem in a reach-avoid setup. Our goal is to design online reinforcement learning algorithms that ensure safety…

Machine Learning · Computer Science 2026-01-21 Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

We study online reinforcement learning for finite-horizon deterministic control systems with arbitrary state and action spaces. Suppose that the transition dynamics and reward function is unknown, but the state and action space is…

Machine Learning · Computer Science 2019-05-07 Lin F. Yang, Chengzhuo Ni, Mengdi Wang

A self-learning optimal control algorithm for episodic fixed-horizon manufacturing processes with time-discrete control actions is proposed and evaluated on a simulated deep drawing process. The control model is built during consecutive…

Systems and Control · Computer Science 2020-01-07 Johannes Dornheim, Norbert Link, Peter Gumbsch