Related papers: Active Model Estimation in Markov Decision Process…

Active Exploration in Markov Decision Processes

We introduce the active exploration problem in Markov decision processes (MDPs). Each state of the MDP is characterized by a random value and the learner should gather samples to estimate the mean value of each state as accurately as…

Machine Learning · Statistics 2019-03-01 Jean Tarbouriech, Alessandro Lazaric

Navigating to the Best Policy in Markov Decision Processes

We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible. We…

Machine Learning · Statistics 2021-10-26 Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere

Provably Efficient Maximum Entropy Exploration

Suppose an agent is in a (possibly unknown) Markov Decision Process in the absence of a reward signal, what might we hope that an agent can efficiently learn to do? This work studies a broad class of objectives that are defined solely as…

Machine Learning · Computer Science 2019-01-29 Elad Hazan, Sham M. Kakade, Karan Singh, Abby Van Soest

Learning Algorithms for Verification of Markov Decision Processes

We present a general framework for applying learning algorithms and heuristical guidance to the verification of Markov decision processes (MDPs). The primary goal of our techniques is to improve performance by avoiding an exhaustive…

Systems and Control · Electrical Eng. & Systems 2025-04-02 Tomáš Brázdil, Krishnendu Chatterjee, Martin Chmelik, Vojtěch Forejt, Jan Křetínský, Marta Kwiatkowska, Tobias Meggendorfer, David Parker, Mateusz Ujma

Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation

This paper proposes a formal approach to online learning and planning for agents operating in a priori unknown, time-varying environments. The proposed method computes the maximally likely model of the environment, given the observations…

Machine Learning · Computer Science 2021-02-09 Melkior Ornik, Ufuk Topcu

Expert-Guided Symmetry Detection in Markov Decision Processes

Learning a Markov Decision Process (MDP) from a fixed batch of trajectories is a non-trivial task whose outcome's quality depends on both the amount and the diversity of the sampled regions of the state-action space. Yet, many MDPs are…

Machine Learning · Computer Science 2022-03-08 Giorgio Angelotti, Nicolas Drougard, Caroline P. C. Chanel

Receding Horizon Curiosity

Sample-efficient exploration is crucial not only for discovering rewarding experiences but also for adapting to environment changes in a task-agnostic fashion. A principled treatment of the problem of optimal input synthesis for system…

Machine Learning · Computer Science 2019-10-10 Matthias Schultheis, Boris Belousov, Hany Abdulsamad, Jan Peters

Robust Markov Decision Processes without Model Estimation

Robust Markov Decision Processes (MDPs) are receiving much attention in learning a robust policy which is less sensitive to environment changes. There are an increasing number of works analyzing sample-efficiency of robust MDPs. However,…

Machine Learning · Statistics 2023-09-13 Wenhao Yang, Han Wang, Tadashi Kozuno, Scott M. Jordan, Zhihua Zhang

Robust Entropy-regularized Markov Decision Processes

Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive with…

Machine Learning · Computer Science 2022-01-03 Tien Mai, Patrick Jaillet

Markov Decision Processes with Noisy State Observation

This paper addresses the challenge of a particular class of noisy state observations in Markov Decision Processes (MDPs), a common issue in various real-world applications. We focus on modeling this uncertainty through a confusion matrix…

Machine Learning · Computer Science 2023-12-15 Amirhossein Afsharrad, Sanjay Lall

An Incremental Off-policy Search in a Model-free Markov Decision Process Using a Single Sample Path

In this paper, we consider a modified version of the control problem in a model free Markov decision process (MDP) setting with large state and action spaces. The control problem most commonly addressed in the contemporary literature is to…

Artificial Intelligence · Computer Science 2018-02-01 Ajin George Joseph, Shalabh Bhatnagar

A Lazy Abstraction Algorithm for Markov Decision Processes: Theory and Initial Evaluation

Analysis of Markov Decision Processes (MDP) is often hindered by state space explosion. Abstraction is a well-established technique in model checking to mitigate this issue. This paper presents a novel lazy abstraction method for MDP…

Logic in Computer Science · Computer Science 2024-06-04 Dániel Szekeres, Kristóf Marussy, István Majzik

A Markov Decision Process Approach to Active Meta Learning

In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a singular task, which yields well-tuned models for specific use, but does not adapt well to new contexts. By contrast,…

Machine Learning · Computer Science 2020-09-11 Bingjia Wang, Alec Koppel, Vikram Krishnamurthy

Cost-Bounded Active Classification Using Partially Observable Markov Decision Processes

Active classification, i.e., the sequential decision-making process aimed at data acquisition for classification purposes, arises naturally in many applications, including medical diagnosis, intrusion detection, and object tracking. In this…

Systems and Control · Computer Science 2018-10-02 Bo Wu, Mohamadreza Ahmadi, Suda Bharadwaj, Ufuk Topcu

Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs

We study reward-free reinforcement learning (RL) with linear function approximation, where the agent works in two phases: (1) in the exploration phase, the agent interacts with the environment but cannot access the reward; and (2) in the…

Machine Learning · Computer Science 2024-02-15 Junkai Zhang, Weitong Zhang, Quanquan Gu

Robustness to Modeling Errors in Risk-Sensitive Markov Decision Problems with Markov Risk Measures

We consider risk-sensitive Markov decision processes (MDPs), where the MDP model is influenced by a parameter which takes values in a compact metric space. We identify sufficient conditions under which small perturbations in the model…

Optimization and Control · Mathematics 2022-09-28 Shiping Shao, Abhishek Gupta, William B. Haskell

Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

In many real-world applications (e.g., planetary exploration, robot navigation), an autonomous agent must be able to explore a space with guaranteed safety. Most safe exploration algorithms in the field of reinforcement learning and…

Artificial Intelligence · Computer Science 2018-09-13 Akifumi Wachi, Hiroshi Kajino, Asim Munawar

Finding good policies in average-reward Markov Decision Processes without prior knowledge

We revisit the identification of an $\varepsilon$-optimal policy in average-reward Markov Decision Processes (MDPs). In such MDPs, two measures of complexity have appeared in the literature: the diameter, $D$, and the optimal bias span, $H$,…

Machine Learning · Computer Science 2024-05-28 Adrienne Tuynman, Rémy Degenne, Emilie Kaufmann

Robust Active Measuring under Model Uncertainty

Partial observability and uncertainty are common problems in sequential decision-making that particularly impede the use of formal models such as Markov decision processes (MDPs). However, in practice, agents may be able to employ costly…

Machine Learning · Computer Science 2023-12-19 Merlijn Krale, Thiago D. Simão, Jana Tumova, Nils Jansen

Bayesian Policy Optimization for Model Uncertainty

Addressing uncertainty is critical for autonomous systems to robustly adapt to the real world. We formulate the problem of model uncertainty as a continuous Bayes-Adaptive Markov Decision Process (BAMDP), where an agent maintains a…

Robotics · Computer Science 2019-05-09 Gilwoo Lee, Brian Hou, Aditya Mandalika, Jeongseok Lee, Sanjiban Choudhury, Siddhartha S. Srinivasa