Related papers: A compact, hierarchical Q-function decomposition

200 papers

This paper presents the MAXQ approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an…

Machine Learning · Computer Science 2007-05-23 Thomas G. Dietterich

Many practical reinforcement learning environments have a discrete factored action space that induces a large combinatorial set of actions, thereby posing significant challenges. Existing approaches leverage the regular structure of the…

Machine Learning · Computer Science 2025-05-01 Junkyu Lee, Tian Gao, Elliot Nelson, Miao Liu, Debarun Bhattacharjya, Songtao Lu

Explaining the behaviour of intelligent agents learned by reinforcement learning (RL) to humans is challenging yet crucial due to their incomprehensible proprioceptive states, variational intermediate goals, and resultant unpredictability.…

Machine Learning · Computer Science 2023-11-07 Wenhao Lu, Xufeng Zhao, Sven Magg, Martin Gromniak, Mengdi Li, Stefan Wermter

Humans decompose novel complex tasks into simpler ones to exploit previously learned skills. Analogously, hierarchical reinforcement learning seeks to leverage lower-level policies for simple tasks to solve complex ones. However, because…

Machine Learning · Computer Science 2022-03-15 Ju-Seung Byun, Andrew Perrault

Reinforcement learning techniques achieved human-level performance in several tasks in the last decade. However, in recent years, the need for interpretability emerged: we want to be able to understand how a system works and the reasons…

Machine Learning · Computer Science 2023-01-13 Leonardo Lucio Custode, Giovanni Iacca

Value-based methods constitute a fundamental methodology in planning and deep reinforcement learning (RL). In this paper, we propose to exploit the underlying structures of the state-action value function, i.e., Q function, for both…

Machine Learning · Computer Science 2020-07-07 Yuzhe Yang, Guo Zhang, Zhi Xu, Dina Katabi

In the past few years, off-policy reinforcement learning methods have shown promising results in their application for robot control. Deep Q-learning, however, still suffers from poor data-efficiency and is susceptible to stochasticity in…

Machine Learning · Computer Science 2020-08-17 Gabriel Kalweit, Maria Huegle, Joschka Boedecker

Reinforcement learning can train policies that effectively perform complex tasks. However, for long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and chaining lower-level skills.…

Machine Learning · Computer Science 2022-03-31 Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey Levine, Brian Ichter

Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates. Value estimation heavily depends on the stochasticity of environmental dynamics and the quality of reward…

Machine Learning · Computer Science 2019-05-28 Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Zhaopeng Meng, Yaodong Yang, Li Wang

The dominant framework for off-policy multi-goal reinforcement learning involves estimating a goal-conditioned Q-value function. When learning to achieve multiple goals, data efficiency is intimately connected with the generalization of the…

Artificial Intelligence · Computer Science 2023-06-28 Zhang-Wei Hong, Ge Yang, Pulkit Agrawal

By reusing data throughout training, off-policy deep reinforcement learning algorithms offer improved sample efficiency relative to on-policy approaches. For continuous action spaces, the most popular methods for off-policy learning include…

Machine Learning · Computer Science 2023-12-01 Jared Markowitz, Jesse Silverberg, Gary Collins

Q-Learning is a fundamental off-policy reinforcement learning (RL) algorithm that has the objective of approximating action-value functions in order to learn optimal policies. Nonetheless, it has difficulties in reconciling bias with…

Machine Learning · Computer Science 2024-11-22 Mahammad Humayoo

Q-learning is a stochastic approximation version of the classic value iteration. The literature has established that Q-learning suffers from both maximization bias and slower convergence. Recently, multi-step algorithms have shown practical…

Machine Learning · Computer Science 2024-07-03 Antony Vijesh, Shreyas S R

Hierarchical decomposition of control is unavoidable in large dynamical systems. In reinforcement learning (RL), it is usually solved with subgoals defined at higher policy levels and achieved at lower policy levels. Reaching these goals…

Designing reinforcement learning (RL) agents is typically a difficult process that requires numerous design iterations. Learning can fail for a multitude of reasons, and standard RL methods provide too few tools to provide insight into the…

Machine Learning · Computer Science 2022-10-24 James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone

Representation learning and exploration are among the key challenges for any deep reinforcement learning agent. In this work, we provide a singular value decomposition based method that can be used to obtain representations that preserve…

Machine Learning · Computer Science 2023-05-03 Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

The breakthrough of deep Q-Learning on different types of environments revolutionized the algorithmic design of Reinforcement Learning to introduce more stable and robust algorithms; to that end, many extensions to the deep Q-Learning algorithm…

Machine Learning · Computer Science 2024-04-16 Mohammed Sabry, Amr M. A. Khalifa

In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal. We propose the use of reward machines (RM) -- Mealy machines used as structured representations…

Multiagent Systems · Computer Science 2021-06-16 Cyrus Neary, Zhe Xu, Bo Wu, Ufuk Topcu

Recent advances in recommender systems have shown that user-system interaction essentially formulates long-term optimization problems, and online reinforcement learning can be adopted to improve recommendation performance. The general…

Information Retrieval · Computer Science 2025-02-04 Xiaobei Wang, Shuchang Liu, Qingpeng Cai, Xiang Li, Lantao Hu, Han li, Guangming Xie

Many reinforcement learning (RL) applications have combinatorial action spaces, where each action is a composition of sub-actions. A standard RL approach ignores this inherent factorization structure, resulting in a potential failure to…

Machine Learning · Computer Science 2023-05-04 Shengpu Tang, Maggie Makar, Michael W. Sjoding, Finale Doshi-Velez, Jenna Wiens