Related papers: Self-Distilled Self-Supervised Representation Lear…

Distilling Visual Priors from Self-Supervised Learning

Convolutional Neural Networks (CNNs) are prone to overfitting on small training datasets. We present a novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models…

Computer Vision and Pattern Recognition · Computer Science 2020-08-04 Bingchen Zhao , Xin Wen

Distilling Vision Transformers for Distortion-Robust Representation Learning

Self-supervised learning has achieved remarkable success in learning visual representations from clean data, yet remains challenging when clean observations are sparse or not available at all. In this paper, we demonstrate that pretrained…

Computer Vision and Pattern Recognition · Computer Science 2026-04-27 Konstantinos Alexis , Giorgos Giannopoulos , Dimitrios Gunopulos

Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning

Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization. While some approaches…

Computation and Language · Computer Science 2022-12-23 Chris Lengerich , Gabriel Synnaeve , Amy Zhang , Hugh Leather , Kurt Shuster , François Charton , Charysse Redwood

Attention Distillation: self-supervised vision transformer students need more guidance

Self-supervised learning has been widely applied to train high-quality vision transformers. Unleashing their excellent performance on memory- and compute-constrained devices is therefore an important research topic. However, how to distill…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Kai Wang , Fei Yang , Joost van de Weijer

Self-Supervised Models are Continual Learners

Self-supervised models have been shown to produce comparable or better visual representations than their supervised counterparts when trained offline on unlabeled data at scale. However, their efficacy is catastrophically reduced in a…

Computer Vision and Pattern Recognition · Computer Science 2022-04-04 Enrico Fini , Victor G. Turrisi da Costa , Xavier Alameda-Pineda , Elisa Ricci , Karteek Alahari , Julien Mairal

Self-Distillation of Hidden Layers for Self-Supervised Representation Learning

The landscape of self-supervised learning (SSL) is currently dominated by generative approaches (e.g., MAE) that reconstruct raw low-level data, and predictive approaches (e.g., I-JEPA) that predict high-level abstract embeddings. While…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Scott C. Lowe , Anthony Fuller , Sageev Oore , Evan Shelhamer , Graham W. Taylor

Leveraging Auto-Distillation and Generative Self-Supervised Learning in Residual Graph Transformers for Enhanced Recommender Systems

This paper introduces a cutting-edge method for enhancing recommender systems through the integration of generative self-supervised learning (SSL) with a Residual Graph Transformer. Our approach emphasizes the importance of superior data…

Information Retrieval · Computer Science 2025-04-16 Eya Mhedhbi , Youssef Mourchid , Alice Othmani

Self-Distilled Representation Learning for Time Series

Self-supervised learning for time-series data holds potential similar to that recently unleashed in Natural Language Processing and Computer Vision. While most existing works in this area focus on contrastive learning, we propose a…

Machine Learning · Computer Science 2023-11-21 Felix Pieper , Konstantin Ditschuneit , Martin Genzel , Alexandra Lindt , Johannes Otterbach

DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning

While self-supervised representation learning (SSL) has received widespread attention from the community, recent research argues that its performance falls off sharply when the model size decreases. The current method mainly relies on…

Computer Vision and Pattern Recognition · Computer Science 2022-07-05 Yuting Gao , Jia-Xin Zhuang , Shaohui Lin , Hao Cheng , Xing Sun , Ke Li , Chunhua Shen

SEED: Self-supervised Distillation For Visual Representation

This paper is concerned with self-supervised learning for small models. The problem is motivated by our empirical studies that while the widely used contrastive self-supervised learning method has shown great progress on large model…

Computer Vision and Pattern Recognition · Computer Science 2021-04-19 Zhiyuan Fang , Jianfeng Wang , Lijuan Wang , Lei Zhang , Yezhou Yang , Zicheng Liu

SDHSI-Net: Learning Better Representations for Hyperspectral Images via Self-Distillation

Hyperspectral image (HSI) classification presents unique challenges due to its high spectral dimensionality and limited labeled data. Traditional deep learning models often suffer from overfitting and high computational costs.…

Computer Vision and Pattern Recognition · Computer Science 2026-01-13 Prachet Dev Singh , Shyamsundar Paramasivam , Sneha Barman , Mainak Singha , Ankit Jha , Girish Mishra , Biplab Banerjee

Towards Compact Single Image Super-Resolution via Contrastive Self-distillation

Convolutional neural networks (CNNs) are highly successful for super-resolution (SR) but often require sophisticated architectures with heavy memory cost and computational overhead, which significantly restricts their practical deployment on…

Computer Vision and Pattern Recognition · Computer Science 2021-05-26 Yanbo Wang , Shaohui Lin , Yanyun Qu , Haiyan Wu , Zhizhong Zhang , Yuan Xie , Angela Yao

Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning

In recent years, pre-trained multimodal large models have attracted widespread attention due to their outstanding performance in various multimodal applications. Nonetheless, the extensive computational resources and vast datasets required…

Computer Vision and Pattern Recognition · Computer Science 2024-04-18 Zhengyang Liang , Meiyu Liang , Wei Huang , Yawen Li , Zhe Xue

Self-Distillation Improves DNA Sequence Inference

Self-supervised pretraining (SSP) has been recognized as a method to enhance prediction accuracy in various downstream tasks. However, its efficacy for DNA sequences remains somewhat constrained. This limitation stems primarily from the…

Machine Learning · Computer Science 2024-05-15 Tong Yu , Lei Cheng , Ruslan Khalitov , Erland Brandser Olsson , Zhirong Yang

Self-Supervised Dataset Distillation for Transfer Learning

Dataset distillation methods have achieved remarkable success in distilling a large dataset into a small set of representative samples. However, they are not designed to produce a distilled dataset that can be effectively used for…

Machine Learning · Computer Science 2024-04-15 Dong Bok Lee , Seanie Lee , Joonho Ko , Kenji Kawaguchi , Juho Lee , Sung Ju Hwang

Randomly Initialized Networks Can Learn from Peer-to-Peer Consensus

In self-supervised learning, self-distilled methods have shown impressive performance, learning representations useful for downstream tasks and even displaying emergent properties. However, state-of-the-art methods usually rely on ensembles…

Machine Learning · Computer Science 2026-05-01 Esteban Rodríguez-Betancourt , Edgar Casasola-Murillo

Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones

Recently, research efforts have been concentrated on revealing how pre-trained models make a difference in neural network performance. Self-supervision and semi-supervised learning technologies have been extensively explored by the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-11 Cheng Cui , Ruoyu Guo , Yuning Du , Dongliang He , Fu Li , Zewu Wu , Qiwen Liu , Shilei Wen , Jizhou Huang , Xiaoguang Hu , Dianhai Yu , Errui Ding , Yanjun Ma

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

Numerous self-supervised learning paradigms, such as contrastive learning and masked image modeling, have been proposed to acquire powerful and general representations from unlabeled data. However, these models are commonly pretrained…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Yuang Liu , Jing Wang , Qiang Zhou , Fan Wang , Jun Wang , Wei Zhang

Contrastive Supervised Distillation for Continual Representation Learning

In this paper, we propose a novel training procedure for the continual representation learning problem in which a neural network model is sequentially learned to alleviate catastrophic forgetting in visual search tasks. Our method, called…

Computer Vision and Pattern Recognition · Computer Science 2022-06-13 Tommaso Barletti , Niccolo' Biondi , Federico Pernici , Matteo Bruni , Alberto Del Bimbo

Hierarchical Self-supervised Augmented Knowledge Distillation

Knowledge distillation often involves how to define and transfer knowledge from teacher to student effectively. Although recent self-supervised contrastive knowledge achieves the best performance, forcing the network to learn such knowledge…

Computer Vision and Pattern Recognition · Computer Science 2022-07-26 Chuanguang Yang , Zhulin An , Linhang Cai , Yongjun Xu