Related papers: Exploiting Exogenous Structure for Sample-Efficien…

Learning in Markov Decision Processes with Exogenous Dynamics

Reinforcement learning algorithms are typically designed for generic Markov Decision Processes (MDPs), where any state-action pair can lead to an arbitrary transition distribution. In many practical systems, however, only a subset of the…

Machine Learning · Computer Science 2026-03-05 Davide Maran, Davide Salaorni, Marcello Restelli

Reinforcement Learning with Exogenous States and Rewards

Exogenous state variables and rewards can slow reinforcement learning by injecting uncontrolled variation into the reward signal. This paper formalizes exogenous state variables and rewards and shows that if the reward function decomposes…

Machine Learning · Computer Science 2026-01-15 George Trimponias, Thomas G. Dietterich

Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information

In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand. Learning from high-dimensional observations has been…

Machine Learning · Computer Science 2022-06-10 Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford

Learning Compact Models for Planning with Exogenous Processes

We address the problem of approximate model minimization for MDPs in which the state is partitioned into endogenous and (much larger) exogenous components. An exogenous state variable is one whose dynamics are independent of the agent's…

Machine Learning · Computer Science 2019-10-01 Rohan Chitnis, Tomás Lozano-Pérez

Expert-Guided Symmetry Detection in Markov Decision Processes

Learning a Markov Decision Process (MDP) from a fixed batch of trajectories is a non-trivial task whose outcome's quality depends on both the amount and the diversity of the sampled regions of the state-action space. Yet, many MDPs are…

Machine Learning · Computer Science 2022-03-08 Giorgio Angelotti, Nicolas Drougard, Caroline P. C. Chanel

Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning

Exogenous state variables and rewards can slow down reinforcement learning by injecting uncontrolled variation into the reward signal. We formalize exogenous state variables and rewards and identify conditions under which an MDP with…

Machine Learning · Computer Science 2018-06-06 Thomas G. Dietterich, George Trimponias, Zhitang Chen

Model-free Reinforcement Learning for Branching Markov Decision Processes

We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMC is a collection of entities of…

Machine Learning · Computer Science 2021-06-15 Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak

Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning

A novel reinforcement learning scheme to synthesize policies for continuous-space Markov decision processes (MDPs) is proposed. This scheme enables one to apply model-free, off-the-shelf reinforcement learning algorithms for finite MDPs to…

Systems and Control · Electrical Eng. & Systems 2020-03-03 Abolfazl Lavaei, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Majid Zamani

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

The curse of dimensionality is a widely known issue in reinforcement learning (RL). In the tabular setting where the state space $\mathcal{S}$ and the action space $\mathcal{A}$ are both finite, to obtain a nearly optimal policy with…

Machine Learning · Computer Science 2022-10-28 Bingyan Wang, Yuling Yan, Jianqing Fan

Is Pure Exploitation Sufficient in Exogenous MDPs with Linear Function Approximation?

Exogenous MDPs (Exo-MDPs) capture sequential decision-making where uncertainty comes solely from exogenous inputs that evolve independently of the learner's actions. This structure is especially common in operations research applications…

Machine Learning · Computer Science 2026-01-29 Hao Liang, Jiayu Cheng, Sean R. Sinclair, Yali Du

Efficient Solution and Learning of Robust Factored MDPs

Robust Markov decision processes (r-MDPs) extend MDPs by explicitly modelling epistemic uncertainty about transition dynamics. Learning r-MDPs from interactions with an unknown environment enables the synthesis of robust policies with…

Machine Learning · Computer Science 2025-11-21 Yannik Schnitzer, Alessandro Abate, David Parker

Elastic Resource Management with Adaptive State Space Partitioning of Markov Decision Processes

Modern large-scale computing deployments consist of complex applications running over machine clusters. An important issue in these is the offering of elasticity, i.e., the dynamic allocation of resources to applications to meet fluctuating…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-02-13 Konstantinos Lolos, Ioannis Konstantinou, Verena Kantere, Nectarios Koziris

Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize…

Optimization and Control · Mathematics 2015-07-07 Mahmoud El Chamie, Behcet Acikmese

An Incremental Off-policy Search in a Model-free Markov Decision Process Using a Single Sample Path

In this paper, we consider a modified version of the control problem in a model-free Markov decision process (MDP) setting with large state and action spaces. The control problem most commonly addressed in the contemporary literature is to…

Artificial Intelligence · Computer Science 2018-02-01 Ajin George Joseph, Shalabh Bhatnagar

Policy Dispersion in Non-Markovian Environment

Markov Decision Process (MDP) presents a mathematical framework to formulate the learning processes of agents in reinforcement learning. MDP is limited by the Markovian assumption that a reward only depends on the immediate state and…

Machine Learning · Computer Science 2024-06-04 Bohao Qu, Xiaofeng Cao, Jielong Yang, Hechang Chen, Chang Yi, Ivor W. Tsang, Yew-Soon Ong

Intrinsically Motivated Multimodal Structure Learning

We present a long-term intrinsically motivated structure learning method for modeling transition dynamics during controlled interactions between a robot and semi-permanent structures in the world. In particular, we discuss how…

Robotics · Computer Science 2016-07-18 Jay Ming Wong, Roderic A. Grupen

MDP modeling for multi-stage stochastic programs

We study a class of multi-stage stochastic programs, which incorporate modeling features from Markov decision processes (MDPs). This class includes structured MDPs with continuous action and state spaces. We extend policy graphs to include…

Machine Learning · Computer Science 2026-04-09 David P. Morton, Oscar Dowson, Bernardo K. Pagnoncelli

Learning non-Markovian Decision-Making from State-only Sequences

Conventional imitation learning assumes access to the actions of demonstrators, but these motor signals are often non-observable in naturalistic settings. Additionally, sequential decision-making behaviors in these settings can deviate from…

Machine Learning · Computer Science 2023-10-31 Aoyang Qin, Feng Gao, Qing Li, Song-Chun Zhu, Sirui Xie

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory

In order to train agents that can quickly adapt to new objectives or reward functions, efficient unsupervised representation learning in sequential decision-making environments can be important. Frameworks such as the Exogenous Block Markov…

Machine Learning · Computer Science 2025-03-18 Alexander Levine, Peter Stone, Amy Zhang