Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation

Qingfeng Lan; Yangchen Pan; Jun Luo; A. Rupam Mahmood

Machine Learning 2023-04-12 v5 Artificial Intelligence

Abstract

Artificial neural networks are promising for general function approximation but challenging to train on non-independent or non-identically distributed data due to catastrophic forgetting. The experience replay buffer, a standard component in deep reinforcement learning, is often used to reduce forgetting and improve sample efficiency by storing experiences in a large buffer and using them for training later. However, a large replay buffer results in a heavy memory burden, especially for onboard and edge devices with limited memory capacities. We propose memory-efficient reinforcement learning algorithms based on the deep Q-network algorithm to alleviate this problem. Our algorithms reduce forgetting and maintain high sample efficiency by consolidating knowledge from the target Q-network to the current Q-network. Compared to baseline methods, our algorithms achieve comparable or better performance in both feature-based and image-based tasks while easing the burden of large experience replay buffers.
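The consolidation idea can be illustrated with a minimal sketch. The abstract does not give the loss, so everything below is an assumption made for illustration: a linear Q-function, a squared-difference consolidation term between the current and target networks, and the hypothetical names q_values, consolidated_loss, sampled_states, and lam (the weight on the consolidation term).

```python
import numpy as np

def q_values(weights, states):
    """Linear Q-function (illustrative): one row of action values per state."""
    return states @ weights

def consolidated_loss(w_current, w_target, batch, sampled_states,
                      gamma=0.99, lam=1.0):
    """DQN-style TD loss plus an assumed consolidation penalty.

    The penalty's exact form is an assumption, not taken from the abstract.
    """
    states, actions, rewards, next_states, dones = batch

    # Standard DQN term: regress Q(s, a) toward a bootstrapped target
    # computed with the target network.
    q_sa = q_values(w_current, states)[np.arange(len(actions)), actions]
    bootstrap = q_values(w_target, next_states).max(axis=1)
    td_target = rewards + gamma * (1.0 - dones) * bootstrap
    td_loss = np.mean((q_sa - td_target) ** 2)

    # Consolidation term (assumed form): keep the current network's action
    # values close to the target network's on a set of sampled states.
    diff = q_values(w_current, sampled_states) - q_values(w_target, sampled_states)
    consolidation = np.mean(diff ** 2)

    return td_loss + lam * consolidation
```

In this sketch the consolidation term needs only states, not full transitions, so sampled_states could come from a small buffer or be drawn from a known state range; which of these is used is left open here.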

Keywords

continual learning; reinforcement learning; neural network training

Cite

@article{arxiv.2205.10868,
  title  = {Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation},
  author = {Qingfeng Lan and Yangchen Pan and Jun Luo and A. Rupam Mahmood},
  journal= {arXiv preprint arXiv:2205.10868},
  year   = {2023}
}

Comments

TMLR camera-ready
