Related papers: Robust Active Distillation

200 papers

The burgeoning complexity of contemporary deep learning models, while achieving unparalleled accuracy, has inadvertently introduced deployment challenges in resource-constrained environments. Knowledge distillation, a technique aiming to…

Machine Learning · Computer Science 2023-10-05 Sia Gholami, Marwan Omar

Recent work on distilling Whisper's knowledge into small models using pseudo-labels shows promising performance while reducing the size by up to 50%. This results in small, efficient, and dedicated models. However, a critical step of…

Computation and Language · Computer Science 2025-05-16 Abdul Waheed, Karima Kadaoui, Bhiksha Raj, Muhammad Abdul-Mageed

Knowledge distillation is an effective approach to leverage a well-trained network or an ensemble of them, named as the teacher, to guide the training of a student network. The outputs from the teacher network are used as soft labels for…

Machine Learning · Computer Science 2021-02-02 Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang

Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited: A large "teacher" neural network is trained on the labeled data available,…

Machine Learning · Computer Science 2022-10-14 Fotis Iliopoulos, Vasilis Kontonis, Cenk Baykal, Gaurav Menghani, Khoa Trinh, Erik Vee

In this research, we propose an innovative method to boost Knowledge Distillation efficiency without the need for resource-heavy teacher models. Knowledge Distillation trains a smaller "student" model with guidance from a larger…

Machine Learning · Computer Science 2024-04-16 Divyang Doshi, Jung-Eun Kim

Knowledge distillation is a strategy of training a student network with guide of the soft output from a teacher network. It has been a successful method of model compression and knowledge transfer. However, currently knowledge distillation…

Machine Learning · Computer Science 2024-10-21 Guangda Ji, Zhanxing Zhu

Knowledge distillation is typically conducted by training a small model (the student) to mimic a large and cumbersome model (the teacher). The idea is to compress the knowledge from the teacher by using its output probabilities as…

Computation and Language · Computer Science 2020-01-17 Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Chenlei Guo

In this paper, we introduce a novel knowledge distillation approach for the semantic segmentation task. Unlike previous methods that rely on power-trained teachers or other modalities to provide additional knowledge, our approach does not…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Shoumeng Qiu, Jie Chen, Xinrun Li, Ru Wan, Xiangyang Xue, Jian Pu

Data-free knowledge distillation is a challenging model lightweight task for scenarios in which the original dataset is not available. Previous methods require a lot of extra computational costs to update one or more generators and their…

Computer Vision and Pattern Recognition · Computer Science 2023-02-24 Yuzheng Wang, Zuhao Ge, Zhaoyu Chen, Xian Liu, Chuangjia Ma, Yunquan Sun, Lizhe Qi

Knowledge distillation has been widely adopted in a variety of tasks and has achieved remarkable successes. Since its inception, many researchers have been intrigued by the dark knowledge hidden in the outputs of the teacher model.…

Machine Learning · Computer Science 2023-02-17 Hua Yuan, Ning Xu, Yu Shi, Xin Geng, Yong Rui

The problem of learning from few labeled examples while using large amounts of unlabeled data has been approached by various semi-supervised methods. Although these methods can achieve superior performance, the models are often not…

Computer Vision and Pattern Recognition · Computer Science 2021-09-21 Sahil Khose, Shruti Jain, V Manushree

Dataset distillation aims to compress training data into fewer examples via a teacher, from which a student can learn effectively. While its success is often attributed to structure in the data, modern neural networks also memorize specific…

Machine Learning · Computer Science 2026-02-23 Freya Behrens, Lenka Zdeborová

Much of the focus in the area of knowledge distillation has been on distilling knowledge from a larger teacher network to a smaller student network. However, there has been little research on how the concept of distillation can be leveraged…

Neural and Evolutionary Computing · Computer Science 2019-01-29 Zhong Qiu Lin, Alexander Wong

Enhancing small language models for real-life application deployment is a significant challenge facing the research community. Due to the difficulties and costs of using large language models, researchers are seeking ways to effectively…

Computation and Language · Computer Science 2024-09-20 Mohamad Ballout, Ulf Krumnack, Gunther Heidemann, Kai-Uwe Kühnberger

Knowledge distillation with unlabeled examples is a powerful training paradigm for generating compact and lightweight student models in applications where the amount of labeled data is limited but one has access to a large pool of unlabeled…

Machine Learning · Computer Science 2023-06-12 Vasilis Kontonis, Fotis Iliopoulos, Khoa Trinh, Cenk Baykal, Gaurav Menghani, Erik Vee

Data quality is a crucial factor in the performance of machine learning models, a principle that dataset distillation methods exploit by compressing training datasets into much smaller counterparts that maintain similar…

Machine Learning · Computer Science 2025-01-22 Tian Qin, Zhiwei Deng, David Alvarez-Melis

The outpouring of various pre-trained models empowers knowledge distillation by providing abundant teacher resources, but there lacks a developed mechanism to utilize these teachers adequately. With a massive model repository composed of…

Machine Learning · Computer Science 2022-09-29 Su Lu, Han-Jia Ye, De-Chuan Zhan

In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage remain the bottleneck of applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation…

Computation and Language · Computer Science 2020-12-15 Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

Knowledge distillation (KD) aims to distill the knowledge from the teacher (larger) to the student (smaller) model via soft-label for the efficient neural network. In general, the performance of a model is determined by accuracy, which is…

Signal Processing · Electrical Eng. & Systems 2025-08-25 Stephen Ekaputra Limantoro

Topic modeling is a dominant method for exploring document collections on the web and in digital libraries. Recent approaches to topic modeling use pretrained contextualized language models and variational autoencoders. However, large…

Computation and Language · Computer Science 2024-06-21 Suman Adhya, Debarshi Kumar Sanyal