English
Related papers

Related papers: Online Attentive Kernel-Based Temporal Difference …

200 papers

Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interacts with their environment. However, existing algorithms often view these problem as static, focusing on point estimates for model…

Machine Learning · Statistics 2024-03-21 Frank Shih , Faming Liang

Reinforcement learning (RL) algorithms allow artificial agents to improve their selection of actions to increase rewarding experiences in their environments. Temporal Difference (TD) Learning -- a model-free RL method -- is a leading…

Machine Learning · Computer Science 2019-09-05 Jacob Rafati , David C. Noelle

Temporal difference learning (TD) is a foundational concept in reinforcement learning (RL), aimed at efficiently assessing a policy's value function. TD($\lambda$), a potent variant, incorporates a memory trace to distribute the prediction…

Machine Learning · Computer Science 2024-02-13 Jianfei Ma

Offline-to-online reinforcement learning (RL) improves sample efficiency by leveraging pre-collected datasets prior to online interaction. A key challenge, however, is learning an accurate critic in large state--action spaces with limited…

Artificial Intelligence · Computer Science 2026-05-21 Andrew Choi , Wei Xu

Because reinforcement learning suffers from a lack of scalability, online value (and Q-) function approximation has received increasing interest this last decade. This contribution introduces a novel approximation scheme, namely the Kalman…

Machine Learning · Computer Science 2014-06-13 Matthieu Geist , Olivier Pietquin

The recent emergence of reinforcement learning has created a demand for robust statistical inference methods for the parameter estimates computed using these algorithms. Existing methods for statistical inference in online learning are…

Machine Learning · Statistics 2022-06-29 Pratik Ramprasad , Yuantong Li , Zhuoran Yang , Zhaoran Wang , Will Wei Sun , Guang Cheng

Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling. However, any…

Machine Learning · Computer Science 2022-07-14 Qinqing Zheng , Amy Zhang , Aditya Grover

Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor lambda. Currently the most important application of these methods is to temporal…

Artificial Intelligence · Computer Science 2008-02-03 P. Cichosz

The field of Offline Reinforcement Learning (RL) aims to derive effective policies from pre-collected datasets without active environment interaction. While traditional offline RL algorithms like Conservative Q-Learning (CQL) and Implicit…

Machine Learning · Computer Science 2025-11-21 Ali Murtaza Caunhye , Asad Jeewa

Temporal difference (TD) learning is a popular algorithm for policy evaluation in reinforcement learning, but the vanilla TD can substantially suffer from the inherent optimization variance. A variance reduced TD (VRTD) algorithm was…

Machine Learning · Computer Science 2020-01-13 Tengyu Xu , Zhe Wang , Yi Zhou , Yingbin Liang

Offline goal-conditioned reinforcement learning (GCRL) offers a practical learning paradigm in which goal-reaching policies are trained from abundant state-action trajectory datasets without additional environment interaction. However,…

Machine Learning · Computer Science 2025-11-05 Hongjoon Ahn , Heewoong Choi , Jisu Han , Taesup Moon

The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. Deep learning methods provide both optimization techniques and architectures for approximating nonlinear…

Machine Learning · Computer Science 2021-06-21 John D. Martin , Joseph Modayil

Long-horizon tasks, which have a large discount factor, pose a challenge for most conventional reinforcement learning (RL) algorithms. Algorithms such as Value Iteration and Temporal Difference (TD) learning have a slow convergence rate and…

Machine Learning · Computer Science 2024-09-04 Mark Bedaywi , Amin Rakhsha , Amir-massoud Farahmand

The goal of offline reinforcement learning (RL) is to extract a high-performance policy from the fixed datasets, minimizing performance degradation due to out-of-distribution (OOD) samples. Offline model-based RL (MBRL) is a promising…

Machine Learning · Computer Science 2025-05-20 Dongsu Lee , Minhae Kwon

We present Q-chunking, a simple yet effective recipe for improving reinforcement learning (RL) algorithms for long-horizon, sparse-reward tasks. Our recipe is designed for the offline-to-online RL setting, where the goal is to leverage an…

Machine Learning · Computer Science 2026-05-12 Qiyang Li , Zhiyuan Zhou , Sergey Levine

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data. Most existing offline RL algorithms use regularization or constraints to suppress extrapolation…

Machine Learning · Computer Science 2021-10-20 Xiaoteng Ma , Yiqin Yang , Hao Hu , Qihan Liu , Jun Yang , Chongjie Zhang , Qianchuan Zhao , Bin Liang

This paper studies offline reinforcement learning with linear function approximation in a setting with decision-theoretic, but not estimation sparsity. The structural restrictions of the data-generating process presume that the transitions…

Machine Learning · Statistics 2024-01-24 Angela Zhou

Meta-learning for offline reinforcement learning (OMRL) is an understudied problem with tremendous potential impact by enabling RL algorithms in many real-world applications. A popular solution to the problem is to infer task identity as…

Machine Learning · Computer Science 2021-10-18 Lanqing Li , Yuanhao Huang , Mingzhe Chen , Siteng Luo , Dijun Luo , Junzhou Huang

Vision-Language-Action (VLA) models demonstrate significant potential for developing generalized policies in real-world robotic control. This progress inspires researchers to explore fine-tuning these models with Reinforcement Learning…

Robotics · Computer Science 2025-08-05 Dongchi Huang , Zhirui Fang , Tianle Zhang , Yihang Li , Lin Zhao , Chunhe Xia

Offline-to-online reinforcement learning (RL) aims to integrate the complementary strengths of offline and online RL by pre-training an agent offline and subsequently fine-tuning it through online interactions. However, recent studies…

‹ Prev 1 2 3 10 Next ›