Related papers: Optimistic MLE -- A Generic Model-based Algorithm …

Sequential Stochastic Optimization in Separable Learning Environments

We consider a class of sequential decision-making problems under uncertainty that can encompass various types of supervised learning concepts. These problems have a completely observed state process and a partially observed modulation…

Optimization and Control · Mathematics 2021-08-24 R. Reid Bishop, Chelsea C. White

Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms

Partial Observability -- where agents can only observe partial information about the true underlying state of the system -- is ubiquitous in real-world applications of Reinforcement Learning (RL). Theoretically, learning a near-optimal…

Machine Learning · Computer Science 2022-12-19 Fan Chen, Yu Bai, Song Mei

Learning in Observable POMDPs, without Computationally Intractable Oracles

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms…

Machine Learning · Computer Science 2022-06-08 Noah Golowich, Ankur Moitra, Dhruv Rohatgi

When Is Partially Observable Reinforcement Learning Not Scary?

Applications of Reinforcement Learning (RL), in which agents learn to make a sequence of decisions despite lacking complete information about the latent states of the controlled system, that is, they act under partial observability of the…

Machine Learning · Computer Science 2022-05-26 Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin

Provable Reinforcement Learning with a Short-Term Memory

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions. Coping with partial…

Machine Learning · Computer Science 2022-02-09 Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi

A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains

Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs…

Artificial Intelligence · Computer Science 2009-09-25 N. L. Zhang, W. Liu

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Learning a transition model via Maximum Likelihood Estimation (MLE) followed by planning inside the learned model is perhaps the most standard and simplest Model-based Reinforcement Learning (RL) framework. In this work, we show that such a…

Machine Learning · Computer Science 2024-10-30 Zhiyong Wang, Dongruo Zhou, John C. S. Lui, Wen Sun

Offline Risk-sensitive RL with Partial Observability to Enhance Performance in Human-Robot Teaming

The integration of physiological computing into mixed-initiative human-robot interaction systems offers valuable advantages in autonomous task allocation by incorporating real-time features as human state observations into the…

Multiagent Systems · Computer Science 2024-02-09 Giorgio Angelotti, Caroline P. C. Chanel, Adam H. M. Pinto, Christophe Lounis, Corentin Chauffaut, Nicolas Drougard

Efficient Learning of POMDPs with Known Observation Model in Average-Reward Setting

Dealing with Partially Observable Markov Decision Processes is notably a challenging task. We face an average-reward infinite-horizon POMDP setting with an unknown transition model, where we assume the knowledge of the observation model.…

Machine Learning · Computer Science 2024-10-03 Alessio Russo, Alberto Maria Metelli, Marcello Restelli

LLM-Guided Probabilistic Program Induction for POMDP Model Estimation

Partially Observable Markov Decision Processes (POMDPs) model decision making under uncertainty. While there are many approaches to approximately solving POMDPs, we aim to address the problem of learning such models. In particular, we are…

Artificial Intelligence · Computer Science 2025-05-13 Aidan Curtis, Hao Tang, Thiago Veloso, Kevin Ellis, Joshua Tenenbaum, Tomás Lozano-Pérez, Leslie Pack Kaelbling

Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization

Multi-objective reinforcement learning (MORL) algorithms tackle sequential decision problems where agents may have different preferences over (possibly conflicting) reward functions. Such algorithms often learn a set of policies (each…

Machine Learning · Computer Science 2023-08-16 Lucas N. Alegre, Ana L. C. Bazzan, Diederik M. Roijers, Ann Nowé, Bruno C. da Silva

Qualitative Analysis of Partially-observable Markov Decision Processes

We study observation-based strategies for partially-observable Markov decision processes (POMDPs) with omega-regular objectives. An observation-based strategy relies on partial information about the history of a play, namely, on the past…

Logic in Computer Science · Computer Science 2015-05-14 Krishnendu Chatterjee, Laurent Doyen, Thomas A. Henzinger

Towards Tractable Optimism in Model-Based Reinforcement Learning

The principle of optimism in the face of uncertainty is prevalent throughout sequential decision making problems such as multi-armed bandits and reinforcement learning (RL). To be successful, an optimistic RL algorithm must over-estimate…

Machine Learning · Computer Science 2021-12-07 Aldo Pacchiano, Philip J. Ball, Jack Parker-Holder, Krzysztof Choromanski, Stephen Roberts

Sample-Efficient Reinforcement Learning of Partially Observable Markov Games

This paper considers the challenging tasks of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent only sees her own individual observations and actions that reveal incomplete information about the…

Machine Learning · Computer Science 2022-10-18 Qinghua Liu, Csaba Szepesvári, Chi Jin

SOMBRL: Scalable and Optimistic Model-Based RL

We address the challenge of efficient exploration in model-based reinforcement learning (MBRL), where the system dynamics are unknown and the RL agent must learn directly from online interactions. We propose Scalable and Optimistic MBRL…

Machine Learning · Computer Science 2025-11-26 Bhavya Sukhija, Lenart Treven, Carmelo Sferrazza, Florian Dörfler, Pieter Abbeel, Andreas Krause

What should be observed for optimal reward in POMDPs?

Partially observable Markov Decision Processes (POMDPs) are a standard model for agents making decisions in uncertain environments. Most work on POMDPs focuses on synthesizing strategies based on the available capabilities. However, system…

Artificial Intelligence · Computer Science 2024-07-12 Alyzia-Maria Konsta, Alberto Lluch Lafuente, Christoph Matheja

Planning in Observable POMDPs in Quasipolynomial Time

Partially Observable Markov Decision Processes (POMDPs) are a natural and general model in reinforcement learning that take into account the agent's uncertainty about its current state. In the literature on POMDPs, it is customary to assume…

Machine Learning · Computer Science 2022-03-24 Noah Golowich, Ankur Moitra, Dhruv Rohatgi

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration. This challenge leads to a number…

Machine Learning · Computer Science 2020-10-27 Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

A Margin-based MLE for Crowdsourced Partial Ranking

A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order. However, in real-world scenarios, some items are intrinsically ambiguous in comparisons, which may very well be an…

Machine Learning · Computer Science 2018-07-31 Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan Yao

Reinforcement Learning based on MPC/MHE for Unmodeled and Partially Observable Dynamics

This paper proposes an observer-based framework for solving Partially Observable Markov Decision Processes (POMDPs) when an accurate model is not available. We first propose to use a Moving Horizon Estimation-Model Predictive Control…

Systems and Control · Electrical Eng. & Systems 2021-03-23 Hossein Nejatbakhsh Esfahani, Arash Bahari Kordabad, Sebastien Gros