Related papers: Efficient Transformers in Reinforcement Learning u…

In-context Reinforcement Learning with Algorithm Distillation

We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model. Algorithm Distillation treats learning to…

Machine Learning · Computer Science · 2022-10-26 · Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih

ELAD: Explanation-Guided Large Language Models Active Distillation

The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs…

Computation and Language · Computer Science · 2024-11-21 · Yifei Zhang, Bo Pan, Chen Ling, Yuntong Hu, Liang Zhao

Parental Guidance: Efficient Lifelong Learning through Evolutionary Distillation

Developing robotic agents that can perform well in diverse environments while showing a variety of behaviors is a key challenge in AI and robotics. Traditional reinforcement learning (RL) methods often create agents that specialize in…

Robotics · Computer Science · 2025-03-25 · Octi Zhang, Quanquan Peng, Rosario Scalise, Bryon Boots

Vintix II: Decision Pre-Trained Transformer is a Scalable In-Context Reinforcement Learner

Recent progress in in-context reinforcement learning (ICRL) has demonstrated its potential for training generalist agents that can acquire new tasks directly at inference. Algorithm Distillation (AD) pioneered this paradigm and was…

Machine Learning · Computer Science · 2026-04-08 · Andrei Polubarov, Lyubaykin Nikita, Alexander Derevyagin, Artyom Grishin, Igor Saprygin, Aleksandr Serkov, Mark Averchenko, Daniil Tikhonov, Maksim Zhdanov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Alexey Zemtsov, Vladislav Kurenkov

MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models

Pretrained language models have led to significant performance gains in many NLP tasks. However, the intensive computing resources to train such models remain an issue. Knowledge distillation alleviates this problem by learning a…

Computation and Language · Computer Science · 2020-05-04 · Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong

Structured Agent Distillation for Large Language Model

Large language models (LLMs) exhibit strong capabilities as decision-making agents by interleaving reasoning and actions, as seen in ReAct-style frameworks. Yet, their practical deployment is constrained by high inference costs and large…

Machine Learning · Computer Science · 2026-05-28 · Jun Liu, Zhenglun Kong, Peiyan Dong, Changdi Yang, Tianqi Li, Hao Tang, Geng Yuan, Wei Niu, Wenbin Zhang, Pu Zhao, Xue Lin, Dong Huang, Yanzhi Wang

AMD: Automatic Multi-step Distillation of Large-scale Vision Models

Transformer-based architectures have become the de-facto standard models for diverse vision tasks owing to their superior performance. As the size of the models continues to scale up, model distillation becomes extremely important in…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-08 · Cheng Han, Qifan Wang, Sohail A. Dianat, Majid Rabbani, Raghuveer M. Rao, Yi Fang, Qiang Guan, Lifu Huang, Dongfang Liu

ReDiF: Reinforced Distillation for Few Step Diffusion

Distillation addresses the slow sampling problem in diffusion models by creating models with smaller size or fewer steps that approximate the behavior of high-step teachers. In this work, we propose a reinforcement learning based…

Machine Learning · Computer Science · 2025-12-30 · Amirhossein Tighkhorshid, Zahra Dehghanian, Gholamali Aminian, Chengchun Shi, Hamid R. Rabiee

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Reinforcement learning (RL) post-training has recently driven major gains in long chain-of-thought reasoning large language models (LLMs), but the high inference cost of such models motivates distillation into smaller students. Most…

Machine Learning · Computer Science · 2026-04-13 · Zhaoyang Zhang, Shuli Jiang, Yantao Shen, Yuting Zhang, Dhananjay Ram, Shuo Yang, Zhuowen Tu, Wei Xia, Stefano Soatto

AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes

While knowledge distillation has become a mature field for compressing large language models (LLMs) into smaller ones by aligning their outputs or internal representations, the distillation of LLM-based agents, which involve planning,…

Artificial Intelligence · Computer Science · 2025-06-18 · Jiahao Qiu, Xinzhe Juan, Yimin Wang, Ling Yang, Xuan Qi, Tongcheng Zhang, Jiacheng Guo, Yifu Lu, Zixin Yao, Hongru Wang, Shilong Liu, Xun Jiang, Liu Leqi, Mengdi Wang

Improving Question Answering Performance Using Knowledge Distillation and Active Learning

Contemporary question answering (QA) systems, including transformer-based architectures, suffer from increasing computational and model complexity which render them inefficient for real-world applications with limited resources. Further,…

Computation and Language · Computer Science · 2021-09-28 · Yasaman Boreshban, Seyed Morteza Mirbostani, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Shahin Amiriparian

TD-MPC-Opt: Distilling Model-Based Multi-Task Reinforcement Learning Agents

We present a novel approach to knowledge transfer in model-based reinforcement learning, addressing the critical challenge of deploying large world models in resource-constrained environments. Our method efficiently distills a high-capacity…

Machine Learning · Computer Science · 2025-07-03 · Dmytro Kuzmenko, Nadiya Shvai

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

Pre-trained language models (e.g., BERT (Devlin et al., 2018) and its variants) have achieved remarkable success in varieties of NLP tasks. However, these models usually consist of hundreds of millions of parameters which brings challenges…

Computation and Language · Computer Science · 2020-04-07 · Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Large-scale vision-language models (VLMs) have recently achieved remarkable multimodal understanding, but their massive size makes them impractical for deployment on mobile or edge devices. This raises the need for compact yet capable VLMs…

Machine Learning · Computer Science · 2025-12-30 · Byung-Kwan Lee, Yu-Chiang Frank Wang, Ryo Hachiuma

Efficient Active Imitation Learning with Random Network Distillation

Developing agents for complex and underspecified tasks, where no clear objective exists, remains challenging but offers many opportunities. This is especially true in video games, where simulated players (bots) need to play realistically,…

Machine Learning · Computer Science · 2025-04-15 · Emilien Biré, Anthony Kobanda, Ludovic Denoyer, Rémy Portelas

Learning Efficient Detector with Semi-supervised Adaptive Distillation

Knowledge Distillation (KD) has been used in image classification for model compression. However, rare studies apply this technology on single-stage object detectors. Focal loss shows that the accumulated errors of easily-classified samples…

Computer Vision and Pattern Recognition · Computer Science · 2019-01-15 · Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen

AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

While large language model (LLM) multi-agent systems achieve superior reasoning performance through iterative debate, practical deployment is limited by their high computational cost and error propagation. This paper proposes AgentArk, a…

Artificial Intelligence · Computer Science · 2026-05-26 · Yinyi Luo, Yiqiao Jin, Weichen Yu, Mengqi Zhang, Srijan Kumar, Xiaoxiao Li, Weijie Xu, Xin Chen, Jindong Wang

Enhancing Reasoning Capabilities in SLMs with Reward Guided Dataset Distillation

The push to compress and impart the proficiency of Large Language Models (LLMs) into more deployable and efficient Small Language Models (SLMs) has benefited from improvements in knowledge distillation (KD) techniques. These techniques…

Artificial Intelligence · Computer Science · 2025-07-02 · Shreyansh Padarha

Reinforced Multi-Teacher Selection for Knowledge Distillation

In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage remain the bottleneck of applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation…

Computation and Language · Computer Science · 2020-12-15 · Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

Real-time Policy Distillation in Deep Reinforcement Learning

Policy distillation in deep reinforcement learning provides an effective way to transfer control policies from a larger network to a smaller untrained network without a significant degradation in performance. However, policy distillation is…

Machine Learning · Computer Science · 2020-01-01 · Yuxiang Sun, Pooyan Fazli