Related papers: Reinforcement Learning for Nested Polar Code Const…

On Reward Structures of Markov Decision Processes

A Markov decision process can be parameterized by a transition kernel and a reward function. Both play essential roles in the study of reinforcement learning as evidenced by their presence in the Bellman equations. In our inquiry of various…

Machine Learning · Computer Science 2023-09-04 Falcon Z. Dai

Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning

Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision…

Machine Learning · Computer Science 2025-05-26 Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. Contrary to traditional RL algorithms…

Machine Learning · Computer Science 2021-08-24 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

Reinforcement Learning of Markov Decision Processes with Peak Constraints

In this paper, we consider reinforcement learning of Markov Decision Processes (MDP) with peak constraints, where an agent chooses a policy to optimize an objective and at the same time satisfy additional constraints. The agent has to take…

Optimization and Control · Mathematics 2019-12-09 Ather Gattami

Construction of Polar Codes with Reinforcement Learning

This paper formulates the polar-code construction problem for the successive-cancellation list (SCL) decoder as a maze-traversing game, which can be solved by reinforcement learning techniques. The proposed method provides a novel technique…

Information Theory · Computer Science 2020-09-22 Yun Liao, Seyyed Ali Hashemi, John Cioffi, Andrea Goldsmith

A reinforcement learning based decision support system in textile manufacturing process

This paper introduced a reinforcement learning based decision support system in textile manufacturing process. A solution optimization problem of color fading ozonation is discussed and set up as a Markov Decision Process (MDP) in terms of…

Machine Learning · Computer Science 2020-05-21 Zhenglei He, Kim Phuc Tran, Sébastien Thomassey, Xianyi Zeng, Changhai Yi

Reinforcement Learning in Switching Non-Stationary Markov Decision Processes: Algorithms and Convergence Analysis

Reinforcement learning in non-stationary environments is challenging due to abrupt and unpredictable changes in dynamics, often causing traditional algorithms to fail to converge. However, in many real-world cases, non-stationarity has some…

Machine Learning · Computer Science 2025-03-25 Mohsen Amiri, Sindri Magnússon

Safe Reinforcement Learning in Constrained Markov Decision Processes

Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications. In this paper, we propose an algorithm, SNO-MDP, that explores and optimizes Markov decision…

Machine Learning · Computer Science 2020-08-18 Akifumi Wachi, Yanan Sui

Action-List Reinforcement Learning Syndrome Decoding for Binary Linear Block Codes

This paper explores the application of reinforcement learning techniques to enhance the performance of decoding of linear block codes based on flipping bits and finding optimal decisions. We describe the methodology for mapping the…

Information Theory · Computer Science 2025-07-29 Milad Taghipour, Bane Vasic

Reinforcement Learning with Non-Markovian Rewards

The standard RL world model is that of a Markov Decision Process (MDP). A basic premise of MDPs is that the rewards depend on the last state and action only. Yet, many real-world rewards are non-Markovian. For example, a reward for bringing…

Artificial Intelligence · Computer Science 2019-12-06 Maor Gaon, Ronen I. Brafman

Reinforcement Learning with Exogenous States and Rewards

Exogenous state variables and rewards can slow reinforcement learning by injecting uncontrolled variation into the reward signal. This paper formalizes exogenous state variables and rewards and shows that if the reward function decomposes…

Machine Learning · Computer Science 2026-01-15 George Trimponias, Thomas G. Dietterich

Active Task-Inference-Guided Deep Inverse Reinforcement Learning

We consider the problem of reward learning for temporally extended tasks. For reward learning, inverse reinforcement learning (IRL) is a widely used paradigm. Given a Markov decision process (MDP) and a set of demonstrations for a task, IRL…

Robotics · Computer Science 2021-07-14 Farzan Memarian, Zhe Xu, Bo Wu, Min Wen, Ufuk Topcu

A Validation Tool for Designing Reinforcement Learning Environments

Reinforcement learning (RL) has gained increasing attraction in the academia and tech industry with launches to a variety of impactful applications and products. Although research is being actively conducted on many fronts (e.g., offline…

Machine Learning · Computer Science 2021-12-13 Ruiyang Xu, Zhengxing Chen

Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes

Average-reward Markov decision processes (MDPs) provide a foundational framework for sequential decision-making under uncertainty. However, average-reward MDPs have remained largely unexplored in reinforcement learning (RL) settings, with…

Machine Learning · Computer Science 2025-08-29 Juan Sebastian Rojas, Chi-Guhn Lee

Reinforcement Learning in a Physics-Inspired Semi-Markov Environment

Reinforcement learning (RL) has been demonstrated to have great potential in many applications of scientific discovery and design. Recent work includes, for example, the design of new structures and compositions of molecules for therapeutic…

Machine Learning · Computer Science 2020-04-17 Colin Bellinger, Rory Coles, Mark Crowley, Isaac Tamblyn

Model-Based Reinforcement Learning with Multinomial Logistic Function Approximation

We study model-based reinforcement learning (RL) for episodic Markov decision processes (MDP) whose transition probability is parametrized by an unknown transition core with features of state and action. Despite much recent progress in…

Machine Learning · Statistics 2024-11-19 Taehyun Hwang, Min-hwan Oh

Meta reinforcement learning as task inference

Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes…

Machine Learning · Computer Science 2019-10-23 Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess

Reinforcement Learning in Reward-Mixing MDPs

Learning a near optimal policy in a partially observable system remains an elusive challenge in contemporary reinforcement learning. In this work, we consider episodic reinforcement learning in a reward-mixing Markov decision process (MDP).…

Machine Learning · Computer Science 2022-02-01 Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor

Modular Deep Reinforcement Learning with Temporal Logic Specifications

We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but encompasses a high-level temporal…

Machine Learning · Computer Science 2019-11-26 Lim Zun Yuan, Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

A Structure-aware Online Learning Algorithm for Markov Decision Processes

To overcome the curse of dimensionality and curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider…

Machine Learning · Computer Science 2018-11-29 Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar