Related papers: Knowledge Transfer in Model-Based Reinforcement Le…

TD-MPC-Opt: Distilling Model-Based Multi-Task Reinforcement Learning Agents

We present a novel approach to knowledge transfer in model-based reinforcement learning, addressing the critical challenge of deploying large world models in resource-constrained environments. Our method efficiently distills a high-capacity…

Machine Learning · Computer Science 2025-07-03 Dmytro Kuzmenko, Nadiya Shvai

Model Compression with Multi-Task Knowledge Distillation for Web-scale Question Answering System

Deep pre-training and fine-tuning models (like BERT, OpenAI GPT) have demonstrated excellent results in question answering areas. However, due to the sheer amount of model parameters, the inference speed of these models is very slow. How to…

Computation and Language · Computer Science 2019-04-23 Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang

Weight Distillation: Transferring the Knowledge in Neural Network Parameters

Knowledge distillation has been proven to be effective in model acceleration and compression. It allows a small network to learn to generalize in the same way as a large network. Recent successes in pre-training suggest the effectiveness of…

Computation and Language · Computer Science 2021-07-20 Ye Lin, Yanyang Li, Ziyang Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu

Knowledge Distillation for Multi-task Learning

Multi-task learning (MTL) is to learn one single model that performs multiple tasks for achieving good performance on all tasks and lower cost on computation. Learning such a model requires to jointly optimize losses of a set of tasks with…

Computer Vision and Pattern Recognition · Computer Science 2020-09-25 Wei-Hong Li, Hakan Bilen

MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models

Pretrained language models have led to significant performance gains in many NLP tasks. However, the intensive computing resources to train such models remain an issue. Knowledge distillation alleviates this problem by learning a…

Computation and Language · Computer Science 2020-05-04 Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models

Vision Foundation Models (VFMs) pretrained on massive datasets exhibit impressive performance on various downstream tasks, especially with limited labeled target data. However, due to their high inference compute cost, these models cannot…

Computer Vision and Pattern Recognition · Computer Science 2024-07-03 Raviteja Vemulapalli, Hadi Pouransari, Fartash Faghri, Sachin Mehta, Mehrdad Farajtabar, Mohammad Rastegari, Oncel Tuzel

Reinforced Multi-Teacher Selection for Knowledge Distillation

In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage remain the bottleneck of applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation…

Computation and Language · Computer Science 2020-12-15 Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning

A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task. Transfer learning proposes to address this issue by re-using knowledge from previously…

Machine Learning · Computer Science 2023-04-28 Remo Sasso, Matthia Sabatelli, Marco A. Wiering

Efficient Multi-Task and Transfer Reinforcement Learning with Parameter-Compositional Framework

In this work, we investigate the potential of improving multi-task training and also leveraging it for transferring in the reinforcement learning setting. We identify several challenges towards this goal and propose a transferring approach…

Robotics · Computer Science 2023-06-06 Lingfeng Sun, Haichao Zhang, Wei Xu, Masayoshi Tomizuka

Multi-Task Multi-Scale Contrastive Knowledge Distillation for Efficient Medical Image Segmentation

This thesis aims to investigate the feasibility of knowledge transfer between neural networks for medical image segmentation tasks, specifically focusing on the transfer from a larger multi-task "Teacher" network to a smaller "Student"…

Image and Video Processing · Electrical Eng. & Systems 2024-06-06 Risab Biswas

Fractional Transfer Learning for Deep Model-Based Reinforcement Learning

Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks. Recent progress in model-based RL allows agents to be much more data-efficient, as it enables them to…

Machine Learning · Computer Science 2021-08-17 Remo Sasso, Matthia Sabatelli, Marco A. Wiering

Distral: Robust Multitask Reinforcement Learning

Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network…

Machine Learning · Computer Science 2017-07-14 Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu

Knowledge Distillation via Weighted Ensemble of Teaching Assistants

Knowledge distillation in machine learning is the process of transferring knowledge from a large model called the teacher to a smaller model called the student. Knowledge distillation is one of the techniques to compress the large network…

Machine Learning · Computer Science 2022-06-27 Durga Prasad Ganta, Himel Das Gupta, Victor S. Sheng

Collaborative Distillation Strategies for Parameter-Efficient Language Model Deployment

This paper addresses the challenges of high computational cost and slow inference in deploying large language models. It proposes a distillation strategy guided by multiple teacher models. The method constructs several teacher models and…

Computation and Language · Computer Science 2025-07-22 Xiandong Meng, Yan Wu, Yexin Tian, Xin Hu, Tianze Kang, Junliang Du

Model-based adaptation for sample efficient transfer in reinforcement learning control of parameter-varying systems

In this paper, we leverage ideas from model-based control to address the sample efficiency problem of reinforcement learning (RL) algorithms. Accelerating learning is an active field of RL highly relevant in the context of time-varying…

Systems and Control · Electrical Eng. & Systems 2023-05-23 Ibrahim Ahmed, Marcos Quinones-Grueiro, Gautam Biswas

Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer

Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Utilizing an efficient model is one of the most important constraints for limited-resource devices. Recently, several compression methods…

Computer Vision and Pattern Recognition · Computer Science 2020-10-12 Mahdi Ghorbani, Fahimeh Fooladgar, Shohreh Kasaei

On effects of Knowledge Distillation on Transfer Learning

Knowledge distillation is a popular machine learning technique that aims to transfer knowledge from a large 'teacher' network to a smaller 'student' network and improve the student's performance by training it to emulate the teacher. In…

Machine Learning · Computer Science 2022-10-19 Sushil Thapa

Model-Based Transfer Learning for Contextual Reinforcement Learning

Deep reinforcement learning (RL) is a powerful approach to complex decision making. However, one issue that limits its practical application is its brittleness, sometimes failing to train in the presence of small changes in the environment.…

Machine Learning · Computer Science 2025-01-27 Jung-Hoon Cho, Vindula Jayawardana, Sirui Li, Cathy Wu

Can a student Large Language Model perform as well as it's teacher?

The burgeoning complexity of contemporary deep learning models, while achieving unparalleled accuracy, has inadvertently introduced deployment challenges in resource-constrained environments. Knowledge distillation, a technique aiming to…

Machine Learning · Computer Science 2023-10-05 Sia Gholami, Marwan Omar

Selective Knowledge Distillation for Neural Machine Translation

Neural Machine Translation (NMT) models achieve state-of-the-art performance on many translation benchmarks. As an active research field in NMT, knowledge distillation is widely applied to enhance the model's performance by transferring…

Computation and Language · Computer Science 2021-05-28 Fusheng Wang, Jianhao Yan, Fandong Meng, Jie Zhou