English
Related papers

Related papers: Learning POMDPs with Linear Function Approximation…

200 papers

We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to study POMDPs requires rigorous approximation methods for…

Optimization and Control · Mathematics 2025-01-20 Ali Devran Kara , Erhan Bayraktar , Serdar Yuksel

In this paper, for POMDPs, we provide the convergence of a Q learning algorithm for control policies using a finite history of past observations and control actions, and, consequentially, we establish near optimality of such limit Q…

Machine Learning · Computer Science 2022-10-27 Ali Devran Kara , Serdar Yuksel

We study reinforcement learning methods with linear function approximation under non-Markov state and cost processes. We first consider the policy evaluation method and show that the algorithm converges under suitable ergodicity conditions…

Machine Learning · Computer Science 2026-01-05 Ali Devran Kara

Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with long-run average objective, the…

Computer Science and Game Theory · Computer Science 2022-09-29 Krishnendu Chatterjee , Raimundo Saona , Bruno Ziliotto

In the theory of Partially Observed Markov Decision Processes (POMDPs), existence of optimal policies have in general been established via converting the original partially observed stochastic control problem to a fully observed one on the…

Optimization and Control · Mathematics 2022-01-11 Ali Devran Kara , Serdar Yuksel

We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDP) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive)…

Artificial Intelligence · Computer Science 2017-06-20 Kamyar Azizzadenesheli , Alessandro Lazaric , Animashree Anandkumar

The continuous nature of belief states in POMDPs presents significant computational challenges in learning the optimal policy. In this paper, we consider an approach that solves a Partially Observable Reinforcement Learning (PORL) problem…

Machine Learning · Computer Science 2025-10-15 Ameya Anjarlekar , Rasoul Etesami , R Srikant

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions. Coping with partial…

Machine Learning · Computer Science 2022-02-09 Yonathan Efroni , Chi Jin , Akshay Krishnamurthy , Sobhan Miryoosefi

We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDP) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive)…

Artificial Intelligence · Computer Science 2017-06-20 Kamyar Azizzadenesheli , Alessandro Lazaric , Animashree Anandkumar

We consider the reinforcement learning problem for partially observed Markov decision processes (POMDPs) with large or even countably infinite state spaces, where the controller has access to only noisy observations of the underlying…

Machine Learning · Computer Science 2023-07-20 Semih Cayci , Niao He , R. Srikant

We study model-based learning of finite-window policies in tabular partially observable Markov decision processes (POMDPs). A common approach to learning under partial observability is to approximate unbounded history dependencies using…

Machine Learning · Computer Science 2026-04-02 Philip Jordan , Maryam Kamgarpour

We study reinforcement learning for partially observed Markov decision processes (POMDPs) with infinite observation and state spaces, which remains less investigated theoretically. To this end, we make the first attempt at bridging partial…

Machine Learning · Computer Science 2024-04-02 Qi Cai , Zhuoran Yang , Zhaoran Wang

Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a…

Artificial Intelligence · Computer Science 2011-06-02 M. Hauskrecht

We study Reinforcement Learning for partially observable dynamical systems using function approximation. We propose a new \textit{Partially Observable Bilinear Actor-Critic framework}, that is general enough to include models such as…

Machine Learning · Computer Science 2022-06-27 Masatoshi Uehara , Ayush Sekhari , Jason D. Lee , Nathan Kallus , Wen Sun

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors, inducing confounding and biasing estimates…

Machine Learning · Computer Science 2023-03-24 Andrew Bennett , Nathan Kallus

The constrained Markov decision process (CMDP) framework emerges as an important reinforcement learning approach for imposing safety or other critical objectives while maximizing cumulative reward. However, the current understanding of how…

Machine Learning · Computer Science 2024-12-11 Tian Tian , Lin F. Yang , Csaba Szepesvári

Reinforcement learning algorithms often require finiteness of state and action spaces in Markov decision processes (MDPs) (also called controlled Markov chains) and various efforts have been made in the literature towards the applicability…

Machine Learning · Computer Science 2023-09-08 Ali Devran Kara , Naci Saldi , Serdar Yüksel

Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to…

Artificial Intelligence · Computer Science 2012-07-02 Scott Sanner , Craig Boutilier

Memoryless and finite-memory policies offer a practical alternative for solving partially observable Markov decision processes (POMDPs), as they operate directly in the output space rather than in the high-dimensional belief space. However,…

Machine Learning · Computer Science 2025-12-15 Roy van Zuijlen , Duarte Antunes
‹ Prev 1 2 3 10 Next ›