Related papers: Quantile Reinforcement Learning

Distributional Reinforcement Learning with Quantile Regression

In reinforcement learning an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the…

Artificial Intelligence · Computer Science 2017-10-30 Will Dabney , Mark Rowland , Marc G. Bellemare , Rémi Munos

Quantile-Based Policy Optimization for Reinforcement Learning

Classical reinforcement learning (RL) aims to optimize the expected cumulative rewards. In this work, we consider the RL setting where the goal is to optimize the quantile of the cumulative rewards. We parameterize the policy controlling…

Machine Learning · Computer Science 2022-02-17 Jinyang Jiang , Jiaqiao Hu , Yijie Peng

Quantum Algorithms for Reinforcement Learning with a Generative Model

Reinforcement learning studies how an agent should interact with an environment to maximize its cumulative reward. A standard way to study this question abstractly is to ask how many samples an agent needs from the environment to learn an…

Quantum Physics · Physics 2021-12-21 Daochen Wang , Aarthi Sundaram , Robin Kothari , Ashish Kapoor , Martin Roetteler

Reinforcement Learning by Comparing Immediate Reward

This paper introduces an approach to Reinforcement Learning Algorithm by comparing their immediate rewards using a variation of Q-Learning algorithm. Unlike the conventional Q-Learning, the proposed algorithm compares current reward with…

Machine Learning · Computer Science 2010-09-15 Punit Pandey , Deepshikha Pandey , Shishir Kumar

Reinforcement Learning in Economics and Finance

Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process, through repeated experience. In a given environment, the agent policy provides him some running and terminal…

Theoretical Economics · Economics 2020-03-24 Arthur Charpentier , Romuald Elie , Carl Remlinger

Quantile-Based Deep Reinforcement Learning using Two-Timescale Policy Gradient Algorithms

Classical reinforcement learning (RL) aims to optimize the expected cumulative reward. In this work, we consider the RL setting where the goal is to optimize the quantile of the cumulative reward. We parameterize the policy controlling…

Machine Learning · Computer Science 2023-05-15 Jinyang Jiang , Jiaqiao Hu , Yijie Peng

Quantum Reinforcement Learning via Policy Iteration

Quantum computing has shown the potential to substantially speed up machine learning applications, in particular for supervised and unsupervised learning. Reinforcement learning, on the other hand, has become essential for solving many…

Quantum Physics · Physics 2023-11-28 El Amine Cherrat , Iordanis Kerenidis , Anupam Prakash

Reinforcement Learning

Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles. In such problems, an agent faces a sequential decision-making…

Machine Learning · Computer Science 2020-06-16 Olivier Buffet , Olivier Pietquin , Paul Weng

A State Augmentation based approach to Reinforcement Learning from Human Preferences

Reinforcement Learning has suffered from poor reward specification, and issues for reward hacking even in simple enough domains. Preference Based Reinforcement Learning attempts to solve the issue by utilizing binary feedbacks on queried…

Artificial Intelligence · Computer Science 2023-02-20 Mudit Verma , Subbarao Kambhampati

The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

We study the problem of temporal-difference-based policy evaluation in reinforcement learning. In particular, we analyse the use of a distributional reinforcement learning algorithm, quantile temporal-difference learning (QTD), for this…

Machine Learning · Computer Science 2023-05-31 Mark Rowland , Yunhao Tang , Clare Lyle , Rémi Munos , Marc G. Bellemare , Will Dabney

Partial Policy Gradients for RL in LLMs

Reinforcement learning is a framework for learning to act sequentially in an unknown environment. We propose a natural approach for modeling policy structure in policy gradients. The key idea is to optimize for a subset of future rewards:…

Machine Learning · Computer Science 2026-03-09 Puneet Mathur , Branislav Kveton , Subhojyoti Mukherjee , Viet Dac Lai

Reinforcement Learning with an Abrupt Model Change

The problem of reinforcement learning is considered where the environment or the model undergoes a change. An algorithm is proposed that an agent can apply in such a problem to achieve the optimal long-time discounted reward. The algorithm…

Systems and Control · Electrical Eng. & Systems 2023-04-25 Wuxia Chen , Taposh Banerjee , Jemin George , Carl Busart

Quantum policy gradient algorithms

Understanding the power and limitations of quantum access to data in machine learning tasks is primordial to assess the potential of quantum computing in artificial intelligence. Previous works have already shown that speed-ups in learning…

Quantum Physics · Physics 2023-07-21 Sofiene Jerbi , Arjan Cornelissen , Māris Ozols , Vedran Dunjko

Reward-Conditioned Policies

Reinforcement learning offers the promise of automating the acquisition of complex behavioral skills. However, compared to commonly used and well-understood supervised learning methods, reinforcement learning algorithms can be brittle,…

Machine Learning · Computer Science 2020-01-01 Aviral Kumar , Xue Bin Peng , Sergey Levine

Reinforcement Learning with Quasi-Hyperbolic Discounting

Reinforcement learning has traditionally been studied with exponential discounting or the average reward setup, mainly due to their mathematical tractability. However, such frameworks fall short of accurately capturing human behavior, which…

Machine Learning · Computer Science 2024-09-18 S. R. Eshwar , Mayank Motwani , Nibedita Roy , Gugan Thoppe

Counterfactually Safe Reinforcement Learning

Reinforcement learning algorithms are generally designed to maximize the expected return across a population. However, a policy that is optimal on average may be suboptimal for certain individuals, leading to potential safety concerns. To…

Machine Learning · Statistics 2026-05-26 Jingyi Li , Peng Wu , Chengchun Shi

Generalizable control for quantum parameter estimation through reinforcement learning

Measurement and estimation of parameters are essential for science and engineering, where one of the main quests is to find systematic schemes that can achieve high precision. While conventional schemes for quantum parameter estimation…

Quantum Physics · Physics 2021-04-29 Han Xu , Junning Li , Liqiang Liu , Yu Wang , Haidong Yuan , Xin Wang

Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning

When multiple agents learn in a decentralized manner, the environment appears non-stationary from the perspective of an individual agent due to the exploration and learning of the other agents. Recently proposed deep multi-agent…

Machine Learning · Computer Science 2020-06-16 Xueguang Lyu , Christopher Amato

A Reinforcement Learning approach for Quantum State Engineering

Machine learning (ML) has become an attractive tool in information processing, however few ML algorithms have been successfully applied in the quantum domain. We show here how classical reinforcement learning (RL) could be used as a tool…

Quantum Physics · Physics 2020-06-02 Jelena Mackeprang , Durga Bhaktavatsala Rao Dasari , Jörg Wrachtrup

Learning Policies through Quantile Regression

Policy gradient based reinforcement learning algorithms coupled with neural networks have shown success in learning complex policies in the model free continuous action space control setting. However, explicitly parameterized policies are…

Machine Learning · Computer Science 2019-09-30 Oliver Richter , Roger Wattenhofer