Related papers: Amalgamating Knowledge towards Comprehensive Class…

Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More

In this paper, we investigate a novel deep-model reusing task. Our goal is to train a lightweight and versatile student model, without human-labelled annotations, that amalgamates the knowledge and masters the expertise of two pretrained…

Computer Vision and Pattern Recognition · Computer Science 2019-04-24 Jingwen Ye, Yixin Ji, Xinchao Wang, Kairi Ou, Dapeng Tao, Mingli Song

Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning

An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations. However, due to the large…

Machine Learning · Computer Science 2019-06-26 Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song

Customizing Student Networks From Heterogeneous Teachers via Adaptive Knowledge Amalgamation

A massive number of well-trained deep networks have been released by developers online. These networks may focus on different tasks and in many cases are optimized for different datasets. In this paper, we study how to exploit such…

Computer Vision and Pattern Recognition · Computer Science 2019-08-21 Chengchao Shen, Mengqi Xue, Xinchao Wang, Jie Song, Li Sun, Mingli Song

Class-Incremental Learning via Knowledge Amalgamation

Catastrophic forgetting has been a significant problem hindering the deployment of deep learning algorithms in the continual learning setting. Numerous methods have been proposed to address the catastrophic forgetting problem where an agent…

Machine Learning · Computer Science 2022-09-07 Marcus de Carvalho, Mahardhika Pratama, Jie Zhang, Yajuan San

Amalgamating Filtered Knowledge: Learning Task-customized Student from Multi-task Teachers

Many well-trained Convolutional Neural Network (CNN) models have now been released online by developers for the sake of effortless reproducing. In this paper, we treat such pre-trained networks as teachers and explore how to learn a target…

Machine Learning · Computer Science 2019-05-29 Jingwen Ye, Xinchao Wang, Yixin Ji, Kairi Ou, Mingli Song

Federated Selective Aggregation for Knowledge Amalgamation

In this paper, we explore a new knowledge-amalgamation problem, termed Federated Selective Aggregation (FedSA). The goal of FedSA is to train a student model for a new task with the help of several decentralized teachers, whose pre-training…

Computer Vision and Pattern Recognition · Computer Science 2022-07-28 Donglin Xie, Ruonan Yu, Gongfan Fang, Jie Song, Zunlei Feng, Xinchao Wang, Li Sun, Mingli Song

Self-Regulated Data-Free Knowledge Amalgamation for Text Classification

Recently, there has been a growing availability of pre-trained text models on various model repositories. These models greatly reduce the cost of training new models from scratch as they can be fine-tuned for specific tasks or trained on…

Computation and Language · Computer Science 2024-06-25 Prashanth Vijayaraghavan, Hongzhi Wang, Luyao Shi, Tyler Baldwin, David Beymer, Ehsan Degan

Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models

As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse these models is vital as it can greatly reduce the retraining computational cost and the potential…

Computation and Language · Computer Science 2021-12-15 Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

Knowledge Amalgamation for Object Detection with Transformers

Knowledge amalgamation (KA) is a novel deep model reusing task aiming to transfer knowledge from several well-trained teachers to a multi-talented and compact student. Currently, most of these approaches are tailored for convolutional…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Haofei Zhang, Feng Mao, Mengqi Xue, Gongfan Fang, Zunlei Feng, Jie Song, Mingli Song

Audio Embeddings as Teachers for Music Classification

Music classification has been one of the most popular tasks in the field of music information retrieval. With the development of deep learning models, the last decade has seen impressive improvements in a wide range of classification tasks.…

Sound · Computer Science 2023-07-03 Yiwei Ding, Alexander Lerch

Embedding Compression for Teacher-to-Student Knowledge Transfer

Common knowledge distillation methods require the teacher model and the student model to be trained on the same task. However, the usage of embeddings as teachers has also been proposed for different source tasks and target tasks. Prior…

Machine Learning · Computer Science 2024-02-13 Yiwei Ding, Alexander Lerch

EEML: Ensemble Embedded Meta-learning

To accelerate the learning process with few samples, meta-learning resorts to prior knowledge from previous tasks. However, inconsistent task distributions and heterogeneity are hard to handle through a global sharing model…

Machine Learning · Computer Science 2022-06-22 Geng Li, Boyuan Ren, Hongzhi Wang

Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

Ensemble models comprising deep Convolutional Neural Networks (CNN) have shown significant improvements in model generalization but at the cost of large computation and memory requirements. In this paper, we present a framework for…

Computer Vision and Pattern Recognition · Computer Science 2020-04-03 Umar Asif, Jianbin Tang, Stefan Harrer

Multi-teacher knowledge distillation as an effective method for compressing ensembles of neural networks

Deep learning has contributed greatly to many successes in artificial intelligence in recent years. Today, it is possible to train models that have thousands of layers and hundreds of billions of parameters. Large-scale deep models have…

Machine Learning · Computer Science 2023-02-15 Konrad Zuchniak

Learning Student-Friendly Teacher Networks for Knowledge Distillation

We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student. Contrary to most of the existing methods that rely on effective training of student models given pretrained…

Machine Learning · Computer Science 2022-01-25 Dae Young Park, Moon-Hyun Cha, Changwook Jeong, Dae Sin Kim, Bohyung Han

UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models

In the era of deep learning, the increasing number of pre-trained models available online presents a wealth of knowledge. These models, developed with diverse architectures and trained on varied datasets for different tasks, provide unique…

Computer Vision and Pattern Recognition · Computer Science 2025-08-28 Yimu Wang, Weiming Zhuang, Chen Chen, Jiabo Huang, Jingtao Li, Lingjuan Lyu

Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation from a Blackbox Model

We study how to train a student deep neural network for visual recognition by distilling knowledge from a blackbox teacher model in a data-efficient manner. Progress on this problem can significantly reduce the dependence on large-scale…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Dongdong Wang, Yandong Li, Liqiang Wang, Boqing Gong

Teacher-Class Network: A Neural Network Compression Mechanism

To reduce the overwhelming size of Deep Neural Networks (DNN), the teacher-student methodology tries to transfer knowledge from a complex teacher network to a simple student network. We instead propose a novel method called the teacher-class…

Machine Learning · Computer Science 2021-11-02 Shaiq Munir Malik, Muhammad Umair Haider, Mohbat Tharani, Musab Rasheed, Murtaza Taj

Efficient Knowledge Distillation from Model Checkpoints

Knowledge distillation is an effective approach to learn compact models (students) with the supervision of large and strong models (teachers). As empirically there exists a strong correlation between the performance of teacher and student…

Machine Learning · Computer Science 2022-10-13 Chaofei Wang, Qisen Yang, Rui Huang, Shiji Song, Gao Huang

Knowledge Distillation via Weighted Ensemble of Teaching Assistants

Knowledge distillation in machine learning is the process of transferring knowledge from a large model called the teacher to a smaller model called the student. Knowledge distillation is one of the techniques to compress the large network…

Machine Learning · Computer Science 2022-06-27 Durga Prasad Ganta, Himel Das Gupta, Victor S. Sheng