Related papers: Efficient Stitchable Task Adaptation

Stitchable Neural Networks

The public model zoo containing enormous powerful pretrained model families (e.g., ResNet/DeiT) has reached an unprecedented scope, which significantly contributes to the success of deep learning. As each model family consists of…

Machine Learning · Computer Science 2023-03-29 Zizheng Pan, Jianfei Cai, Bohan Zhuang

ReStNet: A Reusable & Stitchable Network for Dynamic Adaptation on IoT Devices

With the rapid development of deep learning, a growing number of pre-trained models have been publicly available. However, deploying these fixed models in real-world IoT applications is challenging because different devices possess…

Computer Vision and Pattern Recognition · Computer Science 2025-06-12 Maoyu Wang, Yao Lu, Jiaqi Nie, Zeyu Wang, Yun Lin, Qi Xuan, Guan Gui

Stitched ViTs are Flexible Vision Backbones

Large pretrained plain vision Transformers (ViTs) have been the workhorse for many downstream tasks. However, existing works utilizing off-the-shelf ViTs are inefficient in terms of training and deployment, because adopting ViTs with…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang

Efficiently Aligning Draft Models via Parameter- and Data-Efficient Adaptation

Speculative decoding accelerates LLM inference but suffers from performance degradation when target models are fine-tuned for specific domains. A naive solution is to retrain draft models for every target model, which is costly and…

Machine Learning · Computer Science 2026-03-11 Luxi Lin, Zhihang Lin, Zhanpeng Zeng, Yuhao Chen, Qingyu Zhang, Jixiang Luo, Xuelong Li, Rongrong Ji

Evolving Subnetwork Training for Large Language Models

Large language models have ushered in a new era of artificial intelligence research. However, their substantial training costs hinder further development and widespread adoption. In this paper, inspired by the redundancy in the parameters…

Computation and Language · Computer Science 2024-06-12 Hanqi Li, Lu Chen, Da Ma, Zijian Wu, Su Zhu, Kai Yu

StitchNet: Composing Neural Networks from Pre-Trained Fragments

We propose StitchNet, a novel neural network creation paradigm that stitches together fragments (one or more consecutive network layers) from multiple pre-trained neural networks. StitchNet allows the creation of high-performing neural…

Machine Learning · Computer Science 2023-09-26 Surat Teerapittayanon, Marcus Comiter, Brad McDanel, H. T. Kung

FISTA-Net: Learning A Fast Iterative Shrinkage Thresholding Network for Inverse Problems in Imaging

Inverse problems are essential to imaging applications. In this paper, we propose a model-based deep learning network, named FISTA-Net, by combining the merits of interpretability and generality of the model-based Fast Iterative…

Image and Video Processing · Electrical Eng. & Systems 2021-01-26 Jinxi Xiang, Yonggui Dong, Yunjie Yang

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

Recently, a new trend of exploring sparsity for accelerating neural network training has emerged, embracing the paradigm of training on the edge. This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting for…

Machine Learning · Computer Science 2021-10-28 Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu, Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin

Model Stitching by Functional Latent Alignment

Evaluating functional similarity involves quantifying the degree to which independently trained neural networks learn functionally similar representations. Reliably inferring the functional similarity of these networks remains an open…

Machine Learning · Computer Science 2025-05-27 Ioannis Athanasiadis, Anmar Karmush, Michael Felsberg

Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation

Adapting pre-trained large language models (LLMs) is crucial but challenging due to their enormous size. Parameter-efficient fine-tuning (PEFT) techniques typically employ additive adapters applied to frozen model weights. To further reduce…

Machine Learning · Computer Science 2026-01-05 Tianyi Zhang, Junda Su, Aditya Desai, Oscar Wu, Zhaozhuo Xu, Anshumali Shrivastava

Revisiting Model Stitching In the Foundation Model Era

Model stitching, connecting early layers of one model (source) to later layers of another (target) via a light stitch layer, has served as a probe of representational compatibility. Prior work finds that models trained on the same dataset…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Zheda Mai, Ke Zhang, Fu-En Wang, Zixiao Ken Wang, Albert Y. C. Chen, Lu Xia, Min Sun, Wei-Lun Chao, Cheng-Hao Kuo

Given the wide range of deployment targets, flexible model selection is essential for optimizing performance within a given compute budget. Recent work demonstrates that stitching pretrained models within a model family enables…

Machine Learning · Computer Science 2026-05-29 Debopam Sanyal, Anantharaman Iyer, Alind Khare, Trisha Jain, Akshay Jajoo, Myungjin Lee, Clayton Kerce, Alexey Tumanov

TeST: Test-time Self-Training under Distribution Shift

Despite their recent success, deep neural networks continue to perform poorly when they encounter distribution shifts at test time. Many recently proposed approaches try to counter this by aligning the model to the new distribution prior to…

Computer Vision and Pattern Recognition · Computer Science 2022-09-26 Samarth Sinha, Peter Gehler, Francesco Locatello, Bernt Schiele

Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks, which only tunes a small number of parameters while freezing the vast…

Computer Vision and Pattern Recognition · Computer Science 2023-09-01 Haoyu He, Jianfei Cai, Jing Zhang, Dacheng Tao, Bohan Zhuang

Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning

As deep neural networks are growing in size and being increasingly deployed to more resource-limited devices, there has been a recent surge of interest in network pruning methods, which aim to remove less important weights or activations of…

Machine Learning · Computer Science 2020-06-23 Minyoung Song, Jaehong Yoon, Eunho Yang, Sung Ju Hwang

Efficient Transfer Learning for Video-language Foundation Models

Pre-trained vision-language models provide a robust foundation for efficient transfer learning across various downstream tasks. In the field of video action recognition, mainstream approaches often introduce additional modules to capture…

Computer Vision and Pattern Recognition · Computer Science 2025-03-19 Haoxing Chen, Zizheng Huang, Yan Hong, Yanshuo Wang, Zhongcai Lyu, Zhuoer Xu, Jun Lan, Zhangxuan Gu

STDA-Net: Spectrogram-Based Domain Adaptation for cross-dataset Sleep Stage Classification

Accurate sleep stage classification across datasets remains challenging due to variability in EEG channel montages, sampling rates, recording environments, and subject populations. Although deep learning has shown considerable promise for…

Machine Learning · Computer Science 2026-05-11 Unaza Tallal, Shruti Kshirsagar, Ankita Shukla

Building Variable-sized Models via Learngene Pool

Recently, Stitchable Neural Networks (SN-Net) was proposed to stitch pre-trained networks for quickly building numerous networks with different complexity and performance trade-offs. In this way, the burdens of designing or training the…

Machine Learning · Computer Science 2023-12-13 Boyu Shi, Shiyu Xia, Xu Yang, Haokun Chen, Zhiqiang Kou, Xin Geng

Learn Faster and Forget Slower via Fast and Stable Task Adaptation

Training Deep Neural Networks (DNNs) is still highly time-consuming and compute-intensive. It has been shown that adapting a pretrained model may significantly accelerate this process. With a focus on classification, we show that current…

Neural and Evolutionary Computing · Computer Science 2020-12-01 Farshid Varno, Lucas May Petry, Lisa Di Jorio, Stan Matwin

Elastic Spiking Transformers for Efficient Gesture Understanding

Spiking Neural Networks (SNNs), particularly Spiking Transformers, offer energy-efficient processing of event-based sensor data for healthcare applications. Yet current architectures are rigid: they are trained and deployed as static…

Neural and Evolutionary Computing · Computer Science 2026-05-15 Alberto Ancilotto, Gianluca Amprimo, Stefano Di Carlo, Elisabetta Farella