Related papers: Sample Efficient Reinforcement Learning with Parti…

200 papers

This paper considers a class of reinforcement learning problems, which involve systems with two types of states: stochastic and pseudo-stochastic. In such systems, stochastic states follow a stochastic transition kernel while the…

Machine Learning · Computer Science 2023-11-09 Honghao Wei, Xin Liu, Weina Wang, Lei Ying

We study the sample complexity of online reinforcement learning in the general non-episodic setting of nonlinear dynamical systems with continuous state and action spaces. Our analysis accommodates a large class of dynamical…

Machine Learning · Computer Science 2026-03-02 Michael Muehlebach, Zhiyu He, Michael I. Jordan

Dynamic decision-making under distributional shifts is of fundamental interest in theory and applications of reinforcement learning: The distribution of the environment in which the data is collected can differ from that of the environment…

Machine Learning · Computer Science 2024-09-05 Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

Offline or batch reinforcement learning seeks to learn a near-optimal policy using history data without active exploration of the environment. To counter the insufficient coverage and sample scarcity of many offline datasets, the principle…

Machine Learning · Computer Science 2022-06-14 Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Yuejie Chi

We consider a reinforcement learning setting in which the deployment environment is different from the training environment. Applying a robust Markov decision processes formulation, we extend the distributionally robust $Q$-learning…

Machine Learning · Computer Science 2024-08-02 Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

Applying reinforcement learning (RL) to real-world applications requires addressing a trade-off between asymptotic performance, sample efficiency, and inference time. In this work, we demonstrate how to address this triple challenge by…

Machine Learning · Computer Science 2024-07-03 Zakariae El Asri, Olivier Sigaud, Nicolas Thome

In an episodic Markov Decision Process (MDP) problem, an online algorithm chooses from a set of actions in a sequence of $H$ trials, where $H$ is the episode length, in order to maximize the total payoff of the chosen actions. Q-learning,…

Machine Learning · Computer Science 2019-07-11 Xu Zhu

Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation. When it comes to a finite-horizon episodic Markov decision process with $S$ states, $A$ actions and…

Machine Learning · Computer Science 2022-10-18 Gen Li, Laixi Shi, Yuxin Chen, Yuejie Chi

Reinforcement learning has witnessed significant advancements, particularly with the emergence of model-based approaches. Among these, $Q$-learning has proven to be a powerful algorithm in model-free settings. However, the extension of…

Machine Learning · Computer Science 2026-03-31 Han-Dong Lim, HyeAnn Lee, Donghwan Lee

Model-free reinforcement learning (RL) algorithms, such as Q-learning, directly parameterize and update value functions or policies without explicitly modeling the environment. They are typically simpler, more flexible to use, and thus more…

Machine Learning · Computer Science 2018-07-11 Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael I. Jordan

Recent advances in batch (offline) reinforcement learning have shown promising results in learning from available offline data and proved offline reinforcement learning to be an essential toolkit in learning control policies in a model-free…

Machine Learning · Computer Science 2022-12-19 Ashish Kumar, Ilya Kuzovkin

Reinforcement learning (RL) methods have been shown to be capable of learning intelligent behavior in rich domains. However, this has largely been done in simulated domains without adequate focus on the process of building the simulator. In…

Machine Learning · Computer Science 2019-10-24 Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

A long-standing problem in online reinforcement learning (RL) is of ensuring sample efficiency, which stems from an inability to explore environments efficiently. Most attempts at efficient exploration tackle this problem in a setting where…

Machine Learning · Computer Science 2025-07-08 Aman Mehra, Alexandre Capone, Jeff Schneider

Offline reinforcement learning aims to learn from pre-collected datasets without active exploration. This problem faces significant challenges, including limited data availability and distributional shifts. Existing approaches adopt a…

Machine Learning · Computer Science 2024-10-01 Yue Wang, Jinjun Xiong, Shaofeng Zou

We investigate robust model-free reinforcement learning algorithms designed for environments that may be dynamic or even adversarial. Traditional state-based policies often struggle to accommodate the challenges imposed by the presence of…

Machine Learning · Computer Science 2023-11-02 Udaya Ghai, Arushi Gupta, Wenhan Xia, Karan Singh, Elad Hazan

As a paradigm for sequential decision making in unknown environments, reinforcement learning (RL) has received a flurry of attention in recent years. However, the explosion of model complexity in emerging applications and the presence of…

Machine Learning · Statistics 2025-07-22 Yuejie Chi, Yuxin Chen, Yuting Wei

Reinforcement learning is commonly associated with training of reward-maximizing (or cost-minimizing) agents, in other words, controllers. It can be applied in model-free or model-based fashion, using a priori or online collected system…

Systems and Control · Electrical Eng. & Systems 2022-09-01 Lukas Beckenbach, Pavel Osinenko, Stefan Streif

We consider the question of learning $Q$-function in a sample efficient manner for reinforcement learning with continuous state and action spaces under a generative model. If $Q$-function is Lipschitz continuous, then the minimal sample…

Machine Learning · Computer Science 2020-06-12 Devavrat Shah, Dogyoon Song, Zhi Xu, Yuzhe Yang

Learning and planning in partially-observable domains is one of the most difficult problems in reinforcement learning. Traditional methods consider these two problems as independent, resulting in a classical two-stage paradigm: first learn…

Artificial Intelligence · Computer Science 2019-11-25 Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau

Data assimilation (DA) has increasingly emerged as a critical tool for state estimation across a wide range of applications. It is significantly challenging when the governing equations of the underlying dynamics are unknown. To this end,…

Machine Learning · Computer Science 2026-01-13 Ziyi Wang, Lijian Jiang