Related papers: A PAC RL Algorithm for Episodic POMDPs

Provably Efficient Partially Observable Risk-Sensitive Reinforcement Learning with Hindsight Observation

This work pioneers regret analysis of risk-sensitive reinforcement learning in partially observable environments with hindsight observation, addressing a gap in theoretical exploration. We introduce a novel formulation that integrates…

Machine Learning · Computer Science 2024-02-29 Tonghe Zhang , Yu Chen , Longbo Huang

When Is Partially Observable Reinforcement Learning Not Scary?

Applications of Reinforcement Learning (RL), in which agents learn to make a sequence of decisions despite lacking complete information about the latent states of the controlled system, that is, they act under partial observability of the…

Machine Learning · Computer Science 2022-05-26 Qinghua Liu , Alan Chung , Csaba Szepesvári , Chi Jin

PAC Reinforcement Learning for Predictive State Representations

In this paper we study online Reinforcement Learning (RL) in partially observable dynamical systems. We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models such as…

Machine Learning · Computer Science 2022-08-16 Wenhao Zhan , Masatoshi Uehara , Wen Sun , Jason D. Lee

Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs

In probably approximately correct (PAC) reinforcement learning (RL), an agent is required to identify an $\epsilon$-optimal policy with probability $1-\delta$. While minimax optimal algorithms exist for this problem, its instance-dependent…

Machine Learning · Computer Science 2022-10-25 Andrea Tirinzoni , Aymen Al-Marjani , Emilie Kaufmann

Provable Reinforcement Learning with a Short-Term Memory

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions. Coping with partial…

Machine Learning · Computer Science 2022-02-09 Yonathan Efroni , Chi Jin , Akshay Krishnamurthy , Sobhan Miryoosefi

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration. This challenge leads to a number…

Machine Learning · Computer Science 2020-10-27 Chi Jin , Sham M. Kakade , Akshay Krishnamurthy , Qinghua Liu

Lower Bounds for Learning in Revealing POMDPs

This paper studies the fundamental limits of reinforcement learning (RL) in the challenging \emph{partially observable} setting. While it is well-established that learning in Partially Observable Markov Decision Processes (POMDPs) requires…

Machine Learning · Computer Science 2023-02-03 Fan Chen , Huan Wang , Caiming Xiong , Song Mei , Yu Bai

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

Statistical performance bounds for reinforcement learning (RL) algorithms can be critical for high-stakes applications like healthcare. This paper introduces a new framework for theoretically measuring the performance of such algorithms…

Machine Learning · Computer Science 2018-01-03 Christoph Dann , Tor Lattimore , Emma Brunskill

Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation

We study the reinforcement learning (RL) problem in a constrained Markov decision process (CMDP), where an agent explores the environment to maximize the expected cumulative reward while satisfying a single constraint on the expected total…

Machine Learning · Computer Science 2026-01-29 Toshinori Kitamura , Arnob Ghosh , Tadashi Kozuno , Wataru Kumagai , Kazumi Kasaura , Kenta Hoshino , Yohei Hosoe , Yutaka Matsuo

Variational Recurrent Models for Solving Partially Observable Control Tasks

In partially observable (PO) environments, deep reinforcement learning (RL) agents often suffer from unsatisfactory performance, since two problems need to be tackled together: how to extract information from the raw observations to solve…

Machine Learning · Computer Science 2019-12-25 Dongqi Han , Kenji Doya , Jun Tani

Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms

Partial Observability -- where agents can only observe partial information about the true underlying state of the system -- is ubiquitous in real-world applications of Reinforcement Learning (RL). Theoretically, learning a near-optimal…

Machine Learning · Computer Science 2022-12-19 Fan Chen , Yu Bai , Song Mei

Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes

In recent years, reinforcement learning has achieved many remarkable successes due to the growing adoption of deep learning techniques and the rapid growth in computing power. Nevertheless, it is well-known that flat reinforcement learning…

Artificial Intelligence · Computer Science 2024-10-30 Le Pham Tuyen , Ngo Anh Vien , Abu Layek , TaeChoong Chung

Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

We study Reinforcement Learning for partially observable dynamical systems using function approximation. We propose a new \textit{Partially Observable Bilinear Actor-Critic framework}, that is general enough to include models such as…

Machine Learning · Computer Science 2022-06-27 Masatoshi Uehara , Ayush Sekhari , Jason D. Lee , Nathan Kallus , Wen Sun

Real-Time Recurrent Reinforcement Learning

We introduce a biologically plausible RL framework for solving tasks in partially observable Markov decision processes (POMDPs). The proposed algorithm combines three integral parts: (1) A Meta-RL architecture, resembling the mammalian…

Machine Learning · Computer Science 2025-04-17 Julian Lemmel , Radu Grosu

End-to-End Policy Gradient Method for POMDPs and Explainable Agents

Real-world decision-making problems are often partially observable, and many can be formulated as a Partially Observable Markov Decision Process (POMDP). When we apply reinforcement learning (RL) algorithms to the POMDP, reasonable…

Artificial Intelligence · Computer Science 2023-04-20 Soichiro Nishimori , Sotetsu Koyamada , Shin Ishii

Provable Representation with Efficient Planning for Partial Observable Reinforcement Learning

In most real-world reinforcement learning applications, state information is only partially observable, which breaks the Markov decision process assumption and leads to inferior performance for algorithms that conflate observations with…

Machine Learning · Computer Science 2024-06-12 Hongming Zhang , Tongzheng Ren , Chenjun Xiao , Dale Schuurmans , Bo Dai

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

There have been many recent advances on provably efficient Reinforcement Learning (RL) in problems with rich observation spaces. However, all these works share a strong realizability assumption about the optimal value function of the true…

Machine Learning · Computer Science 2021-06-23 Christoph Dann , Yishay Mansour , Mehryar Mohri , Ayush Sekhari , Karthik Sridharan

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

We study reinforcement learning with function approximation for large-scale Partially Observable Markov Decision Processes (POMDPs) where the state space and observation space are large or even continuous. Particularly, we consider Hilbert…

Machine Learning · Computer Science 2022-06-27 Masatoshi Uehara , Ayush Sekhari , Jason D. Lee , Nathan Kallus , Wen Sun

Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning

Recently, there has been significant progress in understanding reinforcement learning in discounted infinite-horizon Markov decision processes (MDPs) by deriving tight sample complexity bounds. However, in many real-world applications, an…

Machine Learning · Statistics 2016-05-12 Christoph Dann , Emma Brunskill

Partially Observable Multi-Agent Reinforcement Learning with Information Sharing

We study provable multi-agent reinforcement learning (RL) in the general framework of partially observable stochastic games (POSGs). To circumvent the known hardness results and the use of computationally intractable oracles, we advocate…

Machine Learning · Computer Science 2026-03-16 Xiangyu Liu , Kaiqing Zhang