Related papers: A compact, hierarchical Q-function decomposition

Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition

This paper presents the MAXQ approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an…

Machine Learning · Computer Science 2007-05-23 Thomas G. Dietterich

Q-function Decomposition with Intervention Semantics with Factored Action Spaces

Many practical reinforcement learning environments have a discrete factored action space that induces a large combinatorial set of actions, thereby posing significant challenges. Existing approaches leverage the regular structure of the…

Machine Learning · Computer Science 2025-05-01 Junkyu Lee, Tian Gao, Elliot Nelson, Miao Liu, Debarun Bhattacharjya, Songtao Lu

A Closer Look at Reward Decomposition for High-Level Robotic Explanations

Explaining the behaviour of intelligent agents learned by reinforcement learning (RL) to humans is challenging yet crucial due to their incomprehensible proprioceptive states, variational intermediate goals, and resultant unpredictability.…

Machine Learning · Computer Science 2023-11-07 Wenhao Lu, Xufeng Zhao, Sven Magg, Martin Gromniak, Mengdi Li, Stefan Wermter

Training Transition Policies via Distribution Matching for Complex Tasks

Humans decompose novel complex tasks into simpler ones to exploit previously learned skills. Analogously, hierarchical reinforcement learning seeks to leverage lower-level policies for simple tasks to solve complex ones. However, because…

Machine Learning · Computer Science 2022-03-15 Ju-Seung Byun, Andrew Perrault

Evolutionary learning of interpretable decision trees

Reinforcement learning techniques achieved human-level performance in several tasks in the last decade. However, in recent years, the need for interpretability emerged: we want to be able to understand how a system works and the reasons…

Machine Learning · Computer Science 2023-01-13 Leonardo Lucio Custode, Giovanni Iacca

Harnessing Structures for Value-Based Planning and Reinforcement Learning

Value-based methods constitute a fundamental methodology in planning and deep reinforcement learning (RL). In this paper, we propose to exploit the underlying structures of the state-action value function, i.e., Q function, for both…

Machine Learning · Computer Science 2020-07-07 Yuzhe Yang, Guo Zhang, Zhi Xu, Dina Katabi

Composite Q-learning: Multi-scale Q-function Decomposition and Separable Optimization

In the past few years, off-policy reinforcement learning methods have shown promising results in their application for robot control. Deep Q-learning, however, still suffers from poor data-efficiency and is susceptible to stochasticity in…

Machine Learning · Computer Science 2020-08-17 Gabriel Kalweit, Maria Huegle, Joschka Boedecker

Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning

Reinforcement learning can train policies that effectively perform complex tasks. However for long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and chaining lower-level skills.…

Machine Learning · Computer Science 2022-03-31 Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey Levine, Brian Ichter

Disentangling Dynamics and Returns: Value Function Decomposition with Future Prediction

Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates. Value estimation heavily depends on the stochasticity of environmental dynamics and the quality of reward…

Machine Learning · Computer Science 2019-05-28 Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Zhaopeng Meng, Yaodong Yang, Li Wang

Bilinear value networks

The dominant framework for off-policy multi-goal reinforcement learning involves estimating goal conditioned Q-value function. When learning to achieve multiple goals, data efficiency is intimately connected with the generalization of the…

Artificial Intelligence · Computer Science 2023-06-28 Zhang-Wei Hong, Ge Yang, Pulkit Agrawal

Handling Cost and Constraints with Off-Policy Deep Reinforcement Learning

By reusing data throughout training, off-policy deep reinforcement learning algorithms offer improved sample efficiency relative to on-policy approaches. For continuous action spaces, the most popular methods for off-policy learning include…

Machine Learning · Computer Science 2023-12-01 Jared Markowitz, Jesse Silverberg, Gary Collins

Time-Scale Separation in Q-Learning: Extending TD($\triangle$) for Action-Value Function Decomposition

Q-Learning is a fundamental off-policy reinforcement learning (RL) algorithm that has the objective of approximating action-value functions in order to learn optimal policies. Nonetheless, it has difficulties in reconciling bias with…

Machine Learning · Computer Science 2024-11-22 Mahammad Humayoo

Two-Step Q-Learning

Q-learning is a stochastic approximation version of the classic value iteration. The literature has established that Q-learning suffers from both maximization bias and slower convergence. Recently, multi-step algorithms have shown practical…

Machine Learning · Computer Science 2024-07-03 Antony Vijesh, Shreyas S R

Emergency action termination for immediate reaction in hierarchical reinforcement learning

Hierarchical decomposition of control is unavoidable in large dynamical systems. In reinforcement learning (RL), it is usually solved with subgoals defined at higher policy levels and achieved at lower policy levels. Reaching these goals…

Machine Learning · Computer Science 2022-11-14 Michał Bortkiewicz, Jakub Łyskawa, Paweł Wawrzyński, Mateusz Ostaszewski, Artur Grudkowski, Tomasz Trzciński

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

Designing reinforcement learning (RL) agents is typically a difficult process that requires numerous design iterations. Learning can fail for a multitude of reasons, and standard RL methods provide too few tools to provide insight into the…

Machine Learning · Computer Science 2022-10-24 James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

Representation learning and exploration are among the key challenges for any deep reinforcement learning agent. In this work, we provide a singular value decomposition based method that can be used to obtain representations that preserve…

Machine Learning · Computer Science 2023-05-03 Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

On the Reduction of Variance and Overestimation of Deep Q-Learning

The breakthrough of deep Q-Learning on different types of environments revolutionized the algorithmic design of Reinforcement Learning to introduce more stable and robust algorithms, to that end many extensions to deep Q-Learning algorithm…

Machine Learning · Computer Science 2024-04-16 Mohammed Sabry, Amr M. A. Khalifa

Reward Machines for Cooperative Multi-Agent Reinforcement Learning

In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal. We propose the use of reward machines (RM) -- Mealy machines used as structured representations…

Multiagent Systems · Computer Science 2021-06-16 Cyrus Neary, Zhe Xu, Bo Wu, Ufuk Topcu

Value Function Decomposition in Markov Recommendation Process

Recent advances in recommender systems have shown that user-system interaction essentially formulates long-term optimization problems, and online reinforcement learning can be adopted to improve recommendation performance. The general…

Information Retrieval · Computer Science 2025-02-04 Xiaobei Wang, Shuchang Liu, Qingpeng Cai, Xiang Li, Lantao Hu, Han Li, Guangming Xie

Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare

Many reinforcement learning (RL) applications have combinatorial action spaces, where each action is a composition of sub-actions. A standard RL approach ignores this inherent factorization structure, resulting in a potential failure to…

Machine Learning · Computer Science 2023-05-04 Shengpu Tang, Maggie Makar, Michael W. Sjoding, Finale Doshi-Velez, Jenna Wiens