Related papers: Sequential Knockoffs for Variable Selection in Rei…

Feature Selection Using Reinforcement Learning

With the decreasing cost of data collection, the space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially. Therefore, identifying the most characterizing features…

Machine Learning · Computer Science 2021-01-26 Sali Rasoul , Sodiq Adewole , Alphonse Akakpo

Provable Reinforcement Learning with a Short-Term Memory

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions. Coping with partial…

Machine Learning · Computer Science 2022-02-09 Yonathan Efroni , Chi Jin , Akshay Krishnamurthy , Sobhan Miryoosefi

Feature Markov Decision Processes

General purpose intelligent learning agents cycle through (complex,non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes…

Artificial Intelligence · Computer Science 2009-12-30 Marcus Hutter

Reinforcement Learning When All Actions are Not Always Available

The Markov decision process (MDP) formulation used to model many real-world sequential decision making problems does not efficiently capture the setting where the set of available decisions (actions) at each time step is stochastic.…

Machine Learning · Computer Science 2020-01-22 Yash Chandak , Georgios Theocharous , Blossom Metevier , Philip S. Thomas

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

The curse of dimensionality is a widely known issue in reinforcement learning (RL). In the tabular setting where the state space $\mathcal{S}$ and the action space $\mathcal{A}$ are both finite, to obtain a nearly optimal policy with…

Machine Learning · Computer Science 2022-10-28 Bingyan Wang , Yuling Yan , Jianqing Fan

Solving Continual Combinatorial Selection via Deep Reinforcement Learning

We consider the Markov Decision Process (MDP) of selecting a subset of items at each step, termed the Select-MDP (S-MDP). The large state and action spaces of S-MDPs make them intractable to solve with typical reinforcement learning (RL)…

Machine Learning · Computer Science 2019-09-10 Hyungseok Song , Hyeryung Jang , Hai H. Tran , Se-eun Yoon , Kyunghwan Son , Donggyu Yun , Hyoju Chung , Yung Yi

Reinforcement Learning of Sequential Price Mechanisms

We introduce the use of reinforcement learning for indirect mechanisms, working with the existing class of sequential price mechanisms, which generalizes both serial dictatorship and posted price mechanisms and essentially characterizes all…

Computer Science and Game Theory · Computer Science 2021-05-07 Gianluca Brero , Alon Eden , Matthias Gerstgrasser , David C. Parkes , Duncan Rheingans-Yoo

Selecting the State-Representation in Reinforcement Learning

The problem of selecting the right state-representation in a reinforcement learning problem is considered. Several models (functions mapping past observations to a finite set) of the observations are given, and it is known that for at least…

Machine Learning · Computer Science 2013-02-12 Odalric-Ambrym Maillard , Rémi Munos , Daniil Ryabko

Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making

The Markov assumption (MA) is fundamental to the empirical validity of reinforcement learning. In this paper, we propose a novel Forward-Backward Learning procedure to test MA in sequential decision making. The proposed test does not assume…

Machine Learning · Statistics 2020-02-06 Chengchun Shi , Runzhe Wan , Rui Song , Wenbin Lu , Ling Leng

Reinforcement Learning in Reward-Mixing MDPs

Learning a near optimal policy in a partially observable system remains an elusive challenge in contemporary reinforcement learning. In this work, we consider episodic reinforcement learning in a reward-mixing Markov decision process (MDP).…

Machine Learning · Computer Science 2022-02-01 Jeongyeol Kwon , Yonathan Efroni , Constantine Caramanis , Shie Mannor

Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation

We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are…

Machine Learning · Computer Science 2022-10-17 Anna Winnicki , R. Srikant

Safe Reinforcement Learning in Constrained Markov Decision Processes

Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications. In this paper, we propose an algorithm, SNO-MDP, that explores and optimizes Markov decision…

Machine Learning · Computer Science 2020-08-18 Akifumi Wachi , Yanan Sui

Reinforcement Learning in Switching Non-Stationary Markov Decision Processes: Algorithms and Convergence Analysis

Reinforcement learning in non-stationary environments is challenging due to abrupt and unpredictable changes in dynamics, often causing traditional algorithms to fail to converge. However, in many real-world cases, non-stationarity has some…

Machine Learning · Computer Science 2025-03-25 Mohsen Amiri , Sindri Magnússon

Optimizing Sequential Experimental Design with Deep Reinforcement Learning

Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches…

Machine Learning · Computer Science 2022-06-20 Tom Blau , Edwin V. Bonilla , Iadine Chades , Amir Dezfouli

Strategy Synthesis in Markov Decision Processes Under Limited Sampling Access

A central task in control theory, artificial intelligence, and formal methods is to synthesize reward-maximizing strategies for agents that operate in partially unknown environments. In environments modeled by gray-box Markov decision…

Machine Learning · Computer Science 2023-04-25 Christel Baier , Clemens Dubslaff , Patrick Wienhöft , Stefan J. Kiebel

Selecting Near-Optimal Approximate State Representations in Reinforcement Learning

We consider a reinforcement learning setting introduced in (Maillard et al., NIPS 2011) where the learner does not have explicit access to the states of the underlying Markov decision process (MDP). Instead, she has access to several models…

Machine Learning · Computer Science 2014-09-16 Ronald Ortner , Odalric-Ambrym Maillard , Daniil Ryabko

Near-Optimal Partially Observable Reinforcement Learning with Partial Online State Information

Partially observable Markov decision processes (POMDPs) are a general framework for sequential decision-making under latent state uncertainty, yet learning in POMDPs is intractable in the worst case. Motivated by sensing and probing…

Machine Learning · Computer Science 2026-01-27 Ming Shi , Yingbin Liang , Ness B. Shroff

Reinforcement Learning for Task Specifications with Action-Constraints

In this paper, we use concepts from supervisory control theory of discrete event systems to propose a method to learn optimal control policies for a finite-state Markov Decision Process (MDP) in which (only) certain sequences of actions are…

Machine Learning · Computer Science 2022-01-04 Arun Raman , Keerthan Shagrithaya , Shalabh Bhatnagar

On Reward Structures of Markov Decision Processes

A Markov decision process can be parameterized by a transition kernel and a reward function. Both play essential roles in the study of reinforcement learning as evidenced by their presence in the Bellman equations. In our inquiry of various…

Machine Learning · Computer Science 2023-09-04 Falcon Z. Dai