Related papers: Double Reverse Regularization Network Based on Sel…

Lightweight Self-Knowledge Distillation with Multi-source Information Fusion

Knowledge Distillation (KD) is a powerful technique for transferring knowledge between neural network models, where a pre-trained teacher model is used to facilitate the training of the target student model. However, the availability of a…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-17 · Xucong Wang, Pengchao Han, Lei Guo

Toward Robust Semi-supervised Regression via Dual-stream Knowledge Distillation

Semi-supervised regression (SSR), which aims to predict continuous scores for samples while reducing the reliance on large-scale labeled data, has recently attracted considerable attention across various applications, including computer…

Machine Learning · Computer Science · 2026-05-28 · Ye Su, Hezhe Qiao, Wei Huang, Lin Chen

Adaptive Regularization of Labels

Recently, a variety of regularization techniques have been widely applied in deep neural networks, such as dropout, batch normalization, data augmentation, and so on. These methods mainly focus on the regularization of weight parameters to…

Machine Learning · Computer Science · 2019-08-16 · Qianggang Ding, Sifan Wu, Hao Sun, Jiadong Guo, Shu-Tao Xia

Dual-frequency Selected Knowledge Distillation with Statistical-based Sample Rectification for PolSAR Image Classification

The collaborative classification of dual-frequency PolSAR images is a meaningful but also challenging research. The effect of regional consistency on classification information learning and the rational use of dual-frequency data are two…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-08 · Xinyue Xin, Ming Li, Yan Wu, Xiang Li, Peng Zhang, Dazhi Xu

Knowledge distillation for semi-supervised domain adaptation

In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen…

Machine Learning · Computer Science · 2019-08-21 · Mauricio Orbes-Arteaga, Jorge Cardoso, Lauge Sørensen, Christian Igel, Sebastien Ourselin, Marc Modat, Mads Nielsen, Akshay Pai

Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly Detection

Due to the data imbalance and the diversity of defects, student-teacher networks (S-T) are favored in unsupervised anomaly detection, which explores the discrepancy in feature representation derived from the knowledge distillation process…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-02 · Liyi Yao, Shaobing Gao

Weight Averaging Improves Knowledge Distillation under Domain Shift

Knowledge distillation (KD) is a powerful model compression technique broadly used in practical deep learning applications. It is focused on training a small student network to mimic a larger teacher network. While it is widely known that…

Machine Learning · Computer Science · 2023-09-21 · Valeriy Berezovskiy, Nikita Morozov

Self-Knowledge Distillation with Progressive Refinement of Targets

The generalization capability of deep neural networks has been substantially improved by applying a wide spectrum of regularization methods, e.g., restricting function space, injecting randomness during training, augmenting data, etc. In…

Machine Learning · Computer Science · 2021-10-08 · Kyungyul Kim, ByeongMoon Ji, Doyoung Yoon, Sangheum Hwang

Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search

Sequential recommender systems (SRS) have become a research hotspot due to their power in modeling user dynamic interests and sequential behavioral patterns. To maximize model expressive ability, a default choice is to apply a larger and…

Information Retrieval · Computer Science · 2022-04-12 · Lei Chen, Fajie Yuan, Jiaxi Yang, Min Yang, Chengming Li

Dual Correction Strategy for Ranking Distillation in Top-N Recommender System

Knowledge Distillation (KD), which transfers the knowledge of a well-trained large model (teacher) to a small model (student), has become an important area of research for practical deployment of recommender systems. Recently, Relaxed…

Information Retrieval · Computer Science · 2024-05-16 · Youngjune Lee, Kee-Eung Kim

Building Lightweight Semantic Segmentation Models for Aerial Images Using Dual Relation Distillation

Recently, there have been significant improvements in the accuracy of CNN models for semantic segmentation. However, these models are often heavy and suffer from low inference speed, which limits their practical application. To address this…

Image and Video Processing · Electrical Eng. & Systems · 2025-06-27 · Minglong Li, Lianlei Shan, Weiqiang Wang, Ke Lv, Bin Luo, Si-Bao Chen

Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage. To reduce the necessity of…

Computer Vision and Pattern Recognition · Computer Science · 2021-03-16 · Mingi Ji, Seungjae Shin, Seunghyun Hwang, Gibeom Park, Il-Chul Moon

Dual Relation Knowledge Distillation for Object Detection

Knowledge distillation is an effective method for model compression. However, it is still a challenging topic to apply knowledge distillation to detection tasks. There are two key points resulting in poor distillation performance for…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-02 · Zhenliang Ni, Fukui Yang, Shengzhao Wen, Gang Zhang

A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition

Knowledge distillation is an effective transfer of knowledge from a heavy network (teacher) to a small network (student) to boost students' performance. Self-knowledge distillation, the special case of knowledge distillation, has been…

Computer Vision and Pattern Recognition · Computer Science · 2022-09-07 · Duc-Quang Vu, Trang Phung, Jia-Ching Wang

Dynamic Rectification Knowledge Distillation

Knowledge Distillation is a technique which aims to utilize dark knowledge to compress and transfer information from a vast, well-trained neural network (teacher model) to a smaller, less capable neural network (student model) with improved…

Computer Vision and Pattern Recognition · Computer Science · 2022-01-28 · Fahad Rahman Amik, Ahnaf Ismat Tasin, Silvia Ahmed, M. M. Lutfe Elahi, Nabeel Mohammed

Data Upcycling Knowledge Distillation for Image Super-Resolution

Knowledge distillation (KD) compresses deep neural networks by transferring task-related knowledge from cumbersome pre-trained teacher models to compact student models. However, current KD methods for super-resolution (SR) networks overlook…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-30 · Yun Zhang, Wei Li, Simiao Li, Hanting Chen, Zhijun Tu, Wenjia Wang, Bingyi Jing, Shaohui Lin, Jie Hu

Change Detection from Synthetic Aperture Radar Images via Dual Path Denoising Network

Benefiting from the rapid and sustainable development of synthetic aperture radar (SAR) sensors, change detection from SAR images has received increasing attention over the past few years. Existing unsupervised deep learning-based methods…

Image and Video Processing · Electrical Eng. & Systems · 2022-03-15 · Junjie Wang, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Unlocking the Potential of Reverse Distillation for Anomaly Detection

Knowledge Distillation (KD) is a promising approach for unsupervised Anomaly Detection (AD). However, the student network's over-generalization often diminishes the crucial representation differences between teacher and student in anomalous…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-11 · Xinyue Liu, Jianyuan Wang, Biao Leng, Shuo Zhang

Self-Knowledge Distillation via Dropout

To boost the performance, deep neural networks require deeper or wider network structures that involve massive computational and memory costs. To alleviate this issue, the self-knowledge distillation method regularizes the model by…

Computer Vision and Pattern Recognition · Computer Science · 2022-08-12 · Hyoje Lee, Yeachan Park, Hyun Seo, Myungjoo Kang

Adaptive Dual-Teacher Distillation with Subnetwork Rectification for Bridging Semantic Gaps in Black-Box Domain Adaptation

Assuming that neither source data nor source model parameters are accessible, black-box domain adaptation (BBDA) represents a highly practical yet challenging setting, where transferable knowledge is limited to the predictions of a…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-04 · Zhe Zhang, Jing Li, Wanli Xue, Xu Cheng, Jianhua Zhang, Qinghua Hu, Shengyong Chen