Related papers: Policy Distillation with Selective Input Gradient …

Distilling Policy Distillation

The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning. This process, referred to as distillation, has been used to great success, for example, by enhancing the optimisation of agents,…

Machine Learning · Computer Science 2019-02-07 Wojciech Marian Czarnecki, Razvan Pascanu, Simon Osindero, Siddhant M. Jayakumar, Grzegorz Swirszcz, Max Jaderberg

A New Framework for Multi-Agent Reinforcement Learning -- Centralized Training and Exploration with Decentralized Execution via Policy Distillation

Deep reinforcement learning (DRL) is a booming area of artificial intelligence. Many practical applications of DRL naturally involve more than one collaborative learner, making it important to study DRL in a multi-agent context. Previous…

Machine Learning · Computer Science 2019-10-22 Gang Chen

Real-time Policy Distillation in Deep Reinforcement Learning

Policy distillation in deep reinforcement learning provides an effective way to transfer control policies from a larger network to a smaller untrained network without a significant degradation in performance. However, policy distillation is…

Machine Learning · Computer Science 2020-01-01 Yuxiang Sun, Pooyan Fazli

"So, Tell Me About Your Policy...": Distillation of interpretable policies from Deep Reinforcement Learning agents

Recent advances in Reinforcement Learning (RL) largely benefit from the inclusion of Deep Neural Networks, boosting the number of novel approaches proposed in the field of Deep Reinforcement Learning (DRL). These techniques demonstrate the…

Machine Learning · Computer Science 2025-07-30 Giovanni Dispoto, Paolo Bonetti, Marcello Restelli

Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments

This paper aims to explore the potential of combining Deep Reinforcement Learning (DRL) with Knowledge Distillation (KD) by distilling various DRL algorithms and studying their distillation effects. By doing so, the computational burden of…

Machine Learning · Computer Science 2024-04-03 Guanlin Meng

Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration

Safe Reinforcement Learning (RL) aims to find a policy that achieves high rewards while satisfying cost constraints. When learning from scratch, safe RL agents tend to be overly conservative, which impedes exploration and restrains the…

Robotics · Computer Science 2023-10-16 Jinning Li, Xinyi Liu, Banghua Zhu, Jiantao Jiao, Masayoshi Tomizuka, Chen Tang, Wei Zhan

In-context Reinforcement Learning with Algorithm Distillation

We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model. Algorithm Distillation treats learning to…

Machine Learning · Computer Science 2022-10-26 Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih

Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression

Recent advancements in reinforcement learning (RL) have led to remarkable achievements in robot locomotion capabilities. However, the complexity and "black-box" nature of neural network-based RL policies hinder their interpretability and…

Robotics · Computer Science 2024-03-22 Fernando Acero, Zhibin Li

Distilled Domain Randomization

Deep reinforcement learning is an effective tool to learn robot control policies from scratch. However, these methods are notorious for the enormous amount of required training data which is prohibitively expensive to collect on real…

Machine Learning · Computer Science 2021-12-07 Julien Brosseit, Benedikt Hahner, Fabio Muratore, Michael Gienger, Jan Peters

Policy Distillation

Policies for complex visual tasks have been successfully learned with deep reinforcement learning, using an approach called deep Q-networks (DQN), but relatively large (task-specific) networks and extensive training are needed to achieve…

Machine Learning · Computer Science 2016-01-08 Andrei A. Rusu, Sergio Gomez Colmenarejo, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell

Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning

Deep Reinforcement Learning is one of the state-of-the-art methods for producing near-optimal system controllers. However, deep RL algorithms train a deep neural network that lacks transparency, which poses challenges when the controller…

Machine Learning · Computer Science 2025-11-18 Senne Deproost, Dennis Steckelmacher, Ann Nowé

Automaton Distillation: Neuro-Symbolic Transfer Learning for Deep Reinforcement Learning

Reinforcement learning (RL) is a powerful tool for finding optimal policies in sequential decision processes. However, deep RL methods have two weaknesses: collecting the amount of agent experience required for practical RL problems is…

Machine Learning · Computer Science 2024-11-11 Suraj Singireddy, Precious Nwaorgu, Andre Beckus, Aden McKinney, Chinwendu Enyioha, Sumit Kumar Jha, George K. Atia, Alvaro Velasquez

Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control

Deep reinforcement learning has demonstrated increasing capabilities for continuous control problems, including agents that can move with skill and agility through their environment. An open problem in this setting is that of developing…

Machine Learning · Computer Science 2018-02-14 Glen Berseth, Cheng Xie, Paul Cernek, Michiel Van de Panne

Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks

Dataset distillation, a training-aware data compression technique, has recently attracted increasing attention as an effective tool for mitigating costs of optimization and data storage. However, progress remains largely empirical.…

Machine Learning · Computer Science 2026-03-31 Yuri Kinoshita, Naoki Nishikawa, Taro Toyoizumi

Neural Logic Reinforcement Learning

Deep reinforcement learning (DRL) has achieved significant breakthroughs in various tasks. However, most DRL algorithms suffer from a problem of generalizing the learned policy, which makes the learning performance largely affected even by minor…

Machine Learning · Computer Science 2019-07-11 Zhengyao Jiang, Shan Luo

Embracing the Dark Knowledge: Domain Generalization Using Regularized Knowledge Distillation

Though convolutional neural networks are widely used in different tasks, lack of generalization capability in the absence of sufficient and representative data is one of the challenges that hinder their practical application. In this paper,…

Computer Vision and Pattern Recognition · Computer Science 2021-07-07 Yufei Wang, Haoliang Li, Lap-pui Chau, Alex C. Kot

RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

Recent advances in robotic foundation models have enabled the development of generalist policies that can adapt to diverse tasks. While these models show impressive flexibility, their performance heavily depends on the quality of their…

Robotics · Computer Science 2024-12-16 Charles Xu, Qiyang Li, Jianlan Luo, Sergey Levine

Model-Free DRL Control for Power Inverters: From Policy Learning to Real-Time Implementation via Knowledge Distillation

In response to the trade-off between control performance and computational burden hindering the deployment of Deep Reinforcement Learning (DRL) in power inverters, this paper presents a novel model-free control framework leveraging policy…

Systems and Control · Electrical Eng. & Systems 2026-03-10 Yang Yang, Chenggang Cui, Xitong Niu, Jiaming Liu, Chuanlin Zhang

Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence

Risk-sensitive reinforcement learning (RL) is crucial for maintaining reliable performance in high-stakes applications. While traditional RL methods aim to learn a point estimate of the random cumulative cost, distributional RL (DRL) seeks…

Machine Learning · Computer Science 2025-02-03 Minheng Xiao, Xian Yu, Lei Ying

Refined Policy Distillation: From VLA Generalists to RL Experts

Vision-Language-Action Models (VLAs) have demonstrated remarkable generalization capabilities in real-world experiments. However, their success rates are often not on par with expert policies, and they require fine-tuning when the setup…

Robotics · Computer Science 2025-08-05 Tobias Jülg, Wolfram Burgard, Florian Walter