Related papers: Federated Temporal Difference Learning with Linear…

Finite-Time Performance of Distributed Temporal Difference Learning with Linear Function Approximation

We study the policy evaluation problem in multi-agent reinforcement learning, modeled by a Markov decision process. In this problem, the agents operate in a common environment under a fixed control policy, working together to discover the…

Optimization and Control · Mathematics 2020-01-13 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

Federated Stochastic Approximation under Markov Noise and Heterogeneity: Applications in Reinforcement Learning

Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling observations from the environment is usually split across multiple agents. However, transferring these observations from the agents to a central…

Machine Learning · Computer Science 2024-10-22 Sajad Khodadadian, Pranay Sharma, Gauri Joshi, Siva Theja Maguluri

On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations

Federated reinforcement learning (FedRL) enables multiple agents to collaboratively learn a policy without sharing their local trajectories collected during agent-environment interactions. However, in practice, the environments faced by…

Machine Learning · Computer Science 2025-07-18 Guojun Xiong, Shufan Wang, Daniel Jiang, Jian Li

Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling

Federated learning (FL) has recently gained much attention due to its effectiveness in speeding up supervised learning tasks under communication and privacy constraints. However, whether similar speedups can be established for reinforcement…

Machine Learning · Computer Science 2023-05-16 Nicolò Dal Fabbro, Aritra Mitra, George J. Pappas

Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis

Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. However, many existing FedRL algorithms assume that all agents operate in identical…

Machine Learning · Computer Science 2025-06-17 Ali Beikmohammadi, Sarit Khirirat, Peter Richtárik, Sindri Magnússon

Finite-Time Analysis of Asynchronous Multi-Agent TD Learning

Recent research endeavours have theoretically shown the beneficial effect of cooperation in multi-agent reinforcement learning (MARL). In a setting involving $N$ agents, this beneficial effect usually comes in the form of an $N$-fold linear…

Multiagent Systems · Computer Science 2024-07-31 Nicolò Dal Fabbro, Arman Adibi, Aritra Mitra, George J. Pappas

Parameter-Free Federated TD Learning with Markov Noise in Heterogeneous Environments

Federated learning (FL) can dramatically speed up reinforcement learning by distributing exploration and training across multiple agents. It can guarantee an optimal convergence rate that scales linearly in the number of agents, i.e., a…

Machine Learning · Computer Science 2025-10-10 Ankur Naskar, Gugan Thoppe, Utsav Negi, Vijay Gupta

Towards Fast Rates for Federated and Multi-Task Reinforcement Learning

We consider a setting involving $N$ agents, where each agent interacts with an environment modeled as a Markov Decision Process (MDP). The agents' MDPs differ in their reward functions, capturing heterogeneous objectives/tasks. The…

Machine Learning · Computer Science 2024-09-10 Feng Zhu, Robert W. Heath, Aritra Mitra

Personalized Multi-Agent Average Reward TD-Learning via Joint Linear Approximation

We study personalized multi-agent average reward TD learning, in which a collection of agents interacts with different environments and jointly learns their respective value functions. We focus on the setting where there exists a shared…

Machine Learning · Computer Science 2026-03-10 Leo Muxing Wang, Pengkun Yang, Lili Su

Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation for Multi-Agent Reinforcement Learning

We study the policy evaluation problem in multi-agent reinforcement learning. In this problem, a group of agents works cooperatively to evaluate the value function for the global discounted accumulative reward problem, which is composed of…

Optimization and Control · Mathematics 2019-06-04 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization

We study the policy evaluation problem in multi-agent reinforcement learning where a group of agents, with jointly observed states and private local actions and rewards, collaborate to learn the value function of a given policy via local…

Optimization and Control · Mathematics 2021-11-08 Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanović

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation

Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algorithms in reinforcement…

Machine Learning · Computer Science 2018-11-07 Jalaj Bhandari, Daniel Russo, Raghav Singal

Accelerated Distributional Temporal Difference Learning with Linear Function Approximation

In this paper, we study the finite-sample statistical rates of distributional temporal difference (TD) learning with linear function approximation. The purpose of distributional TD learning is to estimate the return distribution of a…

Machine Learning · Statistics 2025-11-18 Kaicheng Jin, Yang Peng, Jiansheng Yang, Zhihua Zhang

Investigating practical linear temporal difference learning

Off-policy reinforcement learning has many applications including: learning from demonstration, learning multiple goal seeking policies in parallel, and representing predictive knowledge. Recently there has been a proliferation of new…

Machine Learning · Computer Science 2016-04-01 Adam White, Martha White

Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Motivated by the emerging use of multi-agent reinforcement learning (MARL) in engineering applications such as networked robotics, swarming drones, and sensor networks, we investigate the policy evaluation problem in a fully decentralized…

Machine Learning · Computer Science 2020-01-31 Jun Sun, Gang Wang, Georgios B. Giannakis, Qinmin Yang, Zaiyue Yang

Preferential Temporal Difference Learning

Temporal-Difference (TD) learning is a general and very useful tool for estimating the value function of a given policy, which in turn is required to find good policies. Generally speaking, TD learning updates states whenever they are…

Machine Learning · Computer Science 2021-08-24 Nishanth Anand, Doina Precup

Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning

Federated reinforcement learning (FRL) has emerged as a promising paradigm for reducing the sample complexity of reinforcement learning tasks by exploiting information from different agents. However, when each agent interacts with a…

Machine Learning · Computer Science 2024-04-16 Chenyu Zhang, Han Wang, Aritra Mitra, James Anderson

Distributed Continual Learning

This work studies the intersection of continual and federated learning, in which independent agents face unique tasks in their environments and incrementally develop and share knowledge. We introduce a mathematical framework capturing the…

Machine Learning · Computer Science 2024-12-24 Long Le, Marcel Hussing, Eric Eaton

Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation. The two…

Machine Learning · Computer Science 2019-06-20 Hugo Penedones, Carlos Riquelme, Damien Vincent, Hartmut Maennel, Timothy Mann, Andre Barreto, Sylvain Gelly, Gergely Neu

Primal-Dual Distributed Temporal Difference Learning

The goal of this paper is to study a distributed version of the gradient temporal-difference (GTD) learning algorithm for a class of multi-agent Markov decision processes (MDPs). The temporal-difference (TD) learning is a reinforcement…

Optimization and Control · Mathematics 2020-04-29 Donghwan Lee, Jianghai Hu