Related papers: Offline Reinforcement Learning with Implicit Q-Lea…

200 papers

Most offline reinforcement learning (RL) methods suffer from the trade-off between improving the policy to surpass the behavior policy and constraining the policy to limit the deviation from the behavior policy as computing $Q$-values using…

Machine Learning · Computer Science 2023-03-29 Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor Wai Kin Chan, Xianyuan Zhan

Offline reinforcement learning (RL) tries to learn the near-optimal policy with recorded offline experience without online exploration. Current offline RL research includes: 1) generative modeling, i.e., approximating a policy using fixed…

Machine Learning · Computer Science 2021-06-23 Hua Wei, Deheng Ye, Zhao Liu, Hao Wu, Bo Yuan, Qiang Fu, Wei Yang, Zhenhui Li

Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning…

Sample efficiency is critical when applying learning-based methods to robotic manipulation due to the high cost of collecting expert demonstrations and the challenges of on-robot policy learning through online Reinforcement Learning (RL).…

Machine Learning · Computer Science 2024-06-21 Arsh Tangri, Ondrej Biza, Dian Wang, David Klee, Owen Howell, Robert Platt

The objective of offline RL is to learn optimal policies when a fixed dataset of exploratory demonstrations is available and sampling additional observations is impossible (typically if this operation is either costly or raises ethical…

Machine Learning · Computer Science 2021-06-10 Firas Jarboui, Vianney Perchet

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is a key challenge for large-scale real-world applications. Offline RL algorithms promise to learn effective policies from previously-collected,…

Machine Learning · Computer Science 2020-08-20 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

Offline reinforcement learning (RL) presents a promising approach for learning reinforced policies from offline datasets without the need for costly or unsafe interactions with the environment. However, datasets collected by humans in…

Machine Learning · Computer Science 2024-03-12 Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang

Implicit Q-learning (IQL) serves as a strong baseline for offline RL, which learns the value function using only dataset actions through quantile regression. However, it is unclear how to recover the implicit policy from the learned…

Machine Learning · Computer Science 2025-11-06 Longxiang He, Li Shen, Xueqian Wang

The core challenge of offline reinforcement learning (RL) is dealing with the (potentially catastrophic) extrapolation error induced by the distribution shift between the history dataset and the desired policy. A large portion of prior work…

Machine Learning · Computer Science 2023-07-27 Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist

Offline Reinforcement Learning (RL) faces a fundamental challenge of extrapolation errors caused by out-of-distribution (OOD) actions. Implicit Q-Learning (IQL) employs expectile regression to achieve in-sample learning. Nevertheless, IQL…

Machine Learning · Computer Science 2026-02-03 Xinchen Han, Hossam Afifi, Michel Marot

Offline reinforcement learning (offline RL), which aims to find an optimal policy from a previously collected static dataset, bears algorithmic difficulties due to function approximation errors from out-of-distribution (OOD) data points. To…

Machine Learning · Computer Science 2021-10-06 Gaon An, Seungyong Moon, Jang-Hyun Kim, Hyun Oh Song

We study offline reinforcement learning of style-conditioned policies using explicit style supervision via subtrajectory labeling functions. In this setting, aligning style with high task performance is particularly challenging due to…

Machine Learning · Computer Science 2026-02-02 Mathieu Petitbois, Rémy Portelas, Sylvain Lamprier

Offline reinforcement learning (RL) aims to learn a policy from a static dataset without further interactions with the environment. Collecting sufficiently large datasets for offline RL is exhausting since this data collection requires…

Artificial Intelligence · Computer Science 2025-10-22 Jongchan Park, Mingyu Park, Donghwan Lee

Offline reinforcement learning (RL) defines the task of learning from a static logged dataset without continually interacting with the environment. The distribution shift between the learned policy and the behavior policy makes it necessary…

Machine Learning · Computer Science 2024-02-22 Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu

Offline inverse reinforcement learning (IRL) aims to recover a reward function that explains expert behavior using only fixed demonstration data, without any additional online interaction. We propose BiCQL-ML, a policy-free offline IRL…

Machine Learning · Computer Science 2025-12-01 Junsung Park

Offline reinforcement learning (RL) looks at learning how to optimally solve tasks using a fixed dataset of interactions from the environment. Many off-policy algorithms developed for online learning struggle in the offline setting as they…

Machine Learning · Computer Science 2025-03-18 Natinael Solomon Neggatu, Jeremie Houssineau, Giovanni Montana

Offline reinforcement learning (RL) allows for the training of competent agents from offline datasets without any interaction with the environment. Online finetuning of such offline models can further improve performance. But how should we…

Machine Learning · Computer Science 2023-03-31 Yicheng Luo, Jackie Kay, Edward Grefenstette, Marc Peter Deisenroth

We consider reinforcement learning (RL) methods in offline domains without additional online data collection, such as mobile health applications. Most existing policy optimization algorithms in the computer science literature are…

Machine Learning · Statistics 2022-07-28 Chengchun Shi, Shikai Luo, Yuan Le, Hongtu Zhu, Rui Song

Supervised imitation-based approaches are often favored over off-policy reinforcement learning approaches for learning policies offline, since their straightforward optimization objective makes them computationally efficient and stable to…

Machine Learning · Computer Science 2025-12-30 Adam Jelley, Trevor McInroe, Sam Devlin, Amos Storkey

Effective offline RL methods require properly handling out-of-distribution actions. Implicit Q-learning (IQL) addresses this by training a Q-function using only dataset actions through a modified Bellman backup. However, it is unclear which…

Machine Learning · Computer Science 2023-05-23 Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine