Related papers: Online Attentive Kernel-Based Temporal Difference …

Fast Value Tracking for Deep Reinforcement Learning

Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment. However, existing algorithms often view these problems as static, focusing on point estimates for model…

Machine Learning · Statistics 2024-03-21 Frank Shih, Faming Liang

Learning sparse representations in reinforcement learning

Reinforcement learning (RL) algorithms allow artificial agents to improve their selection of actions to increase rewarding experiences in their environments. Temporal Difference (TD) Learning -- a model-free RL method -- is a leading…

Machine Learning · Computer Science 2019-09-05 Jacob Rafati, David C. Noelle

Discerning Temporal Difference Learning

Temporal difference learning (TD) is a foundational concept in reinforcement learning (RL), aimed at efficiently assessing a policy's value function. TD($\lambda$), a potent variant, incorporates a memory trace to distribute the prediction…

Machine Learning · Computer Science 2024-02-13 Jianfei Ma

RankQ: Offline-to-Online Reinforcement Learning via Self-Supervised Action Ranking

Offline-to-online reinforcement learning (RL) improves sample efficiency by leveraging pre-collected datasets prior to online interaction. A key challenge, however, is learning an accurate critic in large state-action spaces with limited…

Artificial Intelligence · Computer Science 2026-05-21 Andrew Choi, Wei Xu

Kalman Temporal Differences

Because reinforcement learning suffers from a lack of scalability, online value (and Q-) function approximation has received increasing interest this last decade. This contribution introduces a novel approximation scheme, namely the Kalman…

Machine Learning · Computer Science 2014-06-13 Matthieu Geist, Olivier Pietquin

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning

The recent emergence of reinforcement learning has created a demand for robust statistical inference methods for the parameter estimates computed using these algorithms. Existing methods for statistical inference in online learning are…

Machine Learning · Statistics 2022-06-29 Pratik Ramprasad, Yuantong Li, Zhuoran Yang, Zhaoran Wang, Will Wei Sun, Guang Cheng

Online Decision Transformer

Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling. However, any…

Machine Learning · Computer Science 2022-07-14 Qinqing Zheng, Amy Zhang, Aditya Grover

Truncating Temporal Differences: On the Efficient Implementation of TD(lambda) for Reinforcement Learning

Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor lambda. Currently the most important application of these methods is to temporal…

Artificial Intelligence · Computer Science 2008-02-03 P. Cichosz

A Comparison Between Decision Transformers and Traditional Offline Reinforcement Learning Algorithms

The field of Offline Reinforcement Learning (RL) aims to derive effective policies from pre-collected datasets without active environment interaction. While traditional offline RL algorithms like Conservative Q-Learning (CQL) and Implicit…

Machine Learning · Computer Science 2025-11-21 Ali Murtaza Caunhye, Asad Jeewa

Reanalysis of Variance Reduced Temporal Difference Learning

Temporal difference (TD) learning is a popular algorithm for policy evaluation in reinforcement learning, but the vanilla TD can substantially suffer from the inherent optimization variance. A variance reduced TD (VRTD) algorithm was…

Machine Learning · Computer Science 2020-01-13 Tengyu Xu, Zhe Wang, Yi Zhou, Yingbin Liang

Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning

Offline goal-conditioned reinforcement learning (GCRL) offers a practical learning paradigm in which goal-reaching policies are trained from abundant state-action trajectory datasets without additional environment interaction. However,…

Machine Learning · Computer Science 2025-11-05 Hongjoon Ahn, Heewoong Choi, Jisu Han, Taesup Moon

Adapting the Function Approximation Architecture in Online Reinforcement Learning

The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. Deep learning methods provide both optimization techniques and architectures for approximating nonlinear…

Machine Learning · Computer Science 2021-06-21 John D. Martin, Joseph Modayil

PID Accelerated Temporal Difference Algorithms

Long-horizon tasks, which have a large discount factor, pose a challenge for most conventional reinforcement learning (RL) algorithms. Algorithms such as Value Iteration and Temporal Difference (TD) learning have a slow convergence rate and…

Machine Learning · Computer Science 2024-09-04 Mark Bedaywi, Amin Rakhsha, Amir-massoud Farahmand

Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning

The goal of offline reinforcement learning (RL) is to extract a high-performance policy from the fixed datasets, minimizing performance degradation due to out-of-distribution (OOD) samples. Offline model-based RL (MBRL) is a promising…

Machine Learning · Computer Science 2025-05-20 Dongsu Lee, Minhae Kwon

Reinforcement Learning with Action Chunking

We present Q-chunking, a simple yet effective recipe for improving reinforcement learning (RL) algorithms for long-horizon, sparse-reward tasks. Our recipe is designed for the offline-to-online RL setting, where the goal is to leverage an…

Machine Learning · Computer Science 2026-05-12 Qiyang Li, Zhiyuan Zhou, Sergey Levine

Offline Reinforcement Learning with Value-based Episodic Memory

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data. Most existing offline RL algorithms use regularization or constraints to suppress extrapolation…

Machine Learning · Computer Science 2021-10-20 Xiaoteng Ma, Yiqin Yang, Hao Hu, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang

Reward-Relevance-Filtered Linear Offline Reinforcement Learning

This paper studies offline reinforcement learning with linear function approximation in a setting with decision-theoretic, but not estimation sparsity. The structural restrictions of the data-generating process presume that the transitions…

Machine Learning · Statistics 2024-01-24 Angela Zhou

Provably Improved Context-Based Offline Meta-RL with Attention and Contrastive Learning

Meta-learning for offline reinforcement learning (OMRL) is an understudied problem with tremendous potential impact by enabling RL algorithms in many real-world applications. A popular solution to the problem is to infer task identity as…

Machine Learning · Computer Science 2021-10-18 Lanqing Li, Yuanhao Huang, Mingzhe Chen, Siteng Luo, Dijun Luo, Junzhou Huang

CO-RFT: Efficient Fine-Tuning of Vision-Language-Action Models through Chunked Offline Reinforcement Learning

Vision-Language-Action (VLA) models demonstrate significant potential for developing generalized policies in real-world robotic control. This progress inspires researchers to explore fine-tuning these models with Reinforcement Learning…

Robotics · Computer Science 2025-08-05 Dongchi Huang, Zhirui Fang, Tianle Zhang, Yihang Li, Lin Zhao, Chunhe Xia

Online Pre-Training for Offline-to-Online Reinforcement Learning

Offline-to-online reinforcement learning (RL) aims to integrate the complementary strengths of offline and online RL by pre-training an agent offline and subsequently fine-tuning it through online interactions. However, recent studies…

Machine Learning · Computer Science 2025-07-14 Yongjae Shin, Jeonghye Kim, Whiyoung Jung, Sunghoon Hong, Deunsol Yoon, Youngsoo Jang, Geonhyeong Kim, Jongseong Chae, Youngchul Sung, Kanghoon Lee, Woohyung Lim