Related papers: Flow Q-Learning

200 papers

Offline safe reinforcement learning (RL) seeks reward-maximizing policies from static datasets under strict safety constraints. Existing methods often rely on soft expected-cost objectives or iterative generative inference, which can be…

Machine Learning · Computer Science 2026-03-17 Mumuksh Tayal, Manan Tayal, Ravi Prakash

Diffusion Q-Learning (DQL) has established diffusion policies as a high-performing paradigm for offline reinforcement learning, but its reliance on multi-step denoising for action generation renders both training and inference slow and…

Machine Learning · Computer Science 2026-02-25 Thanh Nguyen, Chang D. Yoo

Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously collected static dataset, is an important paradigm of RL. Standard RL methods often perform poorly in this regime due to the function…

Machine Learning · Computer Science 2023-08-29 Zhendong Wang, Jonathan J Hunt, Mingyuan Zhou

Expressive policies based on flow-matching have been successfully applied in reinforcement learning (RL) more recently due to their ability to model complex action distributions from offline data. These algorithms build on standard policy…

Machine Learning · Computer Science 2026-02-04 Mingxuan Li, Junzhe Zhang, Elias Bareinboim

We present FlowRL, a novel framework for online reinforcement learning that integrates flow-based policy representation with Wasserstein-2-regularized optimization. We argue that in addition to training signals, enhancing the…

Machine Learning · Computer Science 2025-06-17 Lei Lv, Yunfei Li, Yu Luo, Fuchun Sun, Tao Kong, Jiafeng Xu, Xiao Ma

There is growing interest in utilizing flow-based models as decision-making policies in reinforcement learning due to their high expressive capacity. However, effectively leveraging this expressivity for value maximization remains…

Machine Learning · Computer Science 2026-05-14 JaeHyeok Doo, Byeongguk Jeon, Seonghyeon Ye, Kimin Lee, Minjoon Seo

Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that improves over the behavior policy that collected the dataset, while at the same time minimizing the deviation from the behavior policy so as to…

Machine Learning · Computer Science 2021-10-13 Ilya Kostrikov, Ashvin Nair, Sergey Levine

Generative policies based on expressive model classes, such as diffusion and flow matching, are well-suited to complex control problems with highly multimodal action distributions. Their expressivity, however, comes at a significant…

Machine Learning · Computer Science 2026-05-13 Christos Ziakas, Alessandra Russo, Avishek Joey Bose

We propose Flow-Anchored Noise-conditioned Q-Learning (FAN), a highly efficient and high-performing offline reinforcement learning (RL) algorithm. Recent work has shown that expressive flow policies and distributional critics improve…

Machine Learning · Computer Science 2026-05-29 Sungyoung Lee, Dohyeong Kim, Eshan Balachandar, Zelal Su Mustafaoglu, Keshav Pingali

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is a key challenge for large-scale real-world applications. Offline RL algorithms promise to learn effective policies from previously-collected,…

Machine Learning · Computer Science 2020-08-20 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

We introduce a one-step generative policy for offline reinforcement learning that maps noise directly to actions via a residual reformulation of MeanFlow, making it compatible with Q-learning. While one-step Gaussian policies enable fast…

Machine Learning · Computer Science 2025-11-18 Zeyuan Wang, Da Li, Yulin Chen, Ye Shi, Liang Bai, Tianyuan Yu, Yanwei Fu

Generative policies based on diffusion models and flow matching have shown strong promise for offline reinforcement learning (RL), but their applicability remains largely confined to continuous action spaces. To address a broader range of…

Machine Learning · Computer Science 2026-05-14 Fairoz Nower Khan, Nabuat Zaman Nahim, Ruiquan Huang, Haibo Yang, Peizhong Ju

Due to its training stability and strong expression, the diffusion model has attracted considerable attention in offline reinforcement learning. However, several challenges have also come with it: 1) The demand for a large number of…

Machine Learning · Computer Science 2024-01-25 Yuhui Chen, Haoran Li, Dongbin Zhao

Offline reinforcement learning (RL) holds promise as a means to learn high-reward policies from a static dataset, without the need for further environment interactions. However, a key challenge in offline RL lies in effectively stitching…

Machine Learning · Computer Science 2023-09-14 Siddarth Venkatraman, Shivesh Khaitan, Ravi Tej Akella, John Dolan, Jeff Schneider, Glen Berseth

Offline reinforcement learning (RL) leverages pre-collected datasets to train optimal policies. Diffusion Q-Learning (DQL), introducing diffusion models as a powerful and expressive policy class, significantly boosts the performance of…

Machine Learning · Computer Science 2024-11-04 Tianyu Chen, Zhendong Wang, Mingyuan Zhou

Diffusion and flow matching policies offer expressive, multimodal action modeling, yet they are frequently unstable in online reinforcement learning (RL) due to intractable likelihoods and gradients propagating through long sampling chains.…

Machine Learning · Computer Science 2026-03-10 Chubin Zhang, Zhenglin Wan, Feng Chen, Fuchao Yang, Lang Feng, Yaxin Zhou, Xingrui Yu, Yang You, Ivor Tsang, Bo An

Offline reinforcement learning (RL) aims to learn optimal policies from offline datasets, where the parameterization of policies is crucial but often overlooked. Recently, Diffusion-QL significantly boosts the performance of offline RL by…

Machine Learning · Computer Science 2023-10-27 Bingyi Kang, Xiao Ma, Chao Du, Tianyu Pang, Shuicheng Yan

Offline goal-conditioned reinforcement learning (GCRL) is a practical reinforcement learning paradigm that aims to learn goal-conditioned policies from reward-free offline data. Despite recent advances in hierarchical architectures such as…

Machine Learning · Computer Science 2026-04-13 Zhiqiang Dong, Teng Pang, Rongjian Xu, Guoqiang Wu

Offline reinforcement learning (RL) enables policy learning from fixed datasets without further environment interaction, making it particularly valuable in high-risk or costly domains. Extreme $Q$-Learning (XQL) is a recent offline RL…

Machine Learning · Computer Science 2026-04-15 Xinming Gao, Shangzhe Li, Yujin Cai, Wenwu Yu

Reinforcement learning (RL) is a powerful paradigm for learning to make sequences of decisions. However, RL has yet to be fully leveraged in robotics, principally due to its lack of scalability. Offline RL offers a promising avenue by…

Machine Learning · Computer Science 2025-10-10 Nicolas Espinosa-Dice, Kiante Brantley, Wen Sun