Sparse Structure Search for Parameter-Efficient Tuning

Shengding Hu; Zhen Zhang; Ning Ding; Yadao Wang; Yasheng Wang; Zhiyuan Liu; Maosong Sun

Computation and Language 2022-06-16 v1

Abstract

Adapting large pre-trained models (PTMs) through fine-tuning imposes prohibitive computational and storage burdens. Recent studies of parameter-efficient tuning (PET) find that optimizing only a small portion of parameters conditioned on PTMs can yield performance on par with conventional fine-tuning. Generally, PET methods carefully design parameter-efficient modules (PET modules) that can be applied to arbitrary fine-grained positions inside PTMs. However, the effectiveness of these fine-grained positions relies largely on sophisticated manual designation, which usually produces sub-optimal results. In contrast to manual designation, we explore constructing PET modules automatically. We automatically Search for the Sparse Structure of Parameter-Efficient Tuning (S³PET). Based on a unified framework of various PET methods, S³PET conducts a differentiable PET structure search through bi-level optimization and proposes a shifted global sigmoid method to explicitly control the number of trainable parameters. Extensive experiments show that S³PET surpasses manual and random structures with fewer trainable parameters. The searched structures preserve more than 99% of fine-tuning performance with 0.01% trainable parameters. Moreover, the advantage of S³PET is amplified under extremely low trainable-parameter budgets (0.0009% to 0.01%). The searched structures are transferable and explainable, providing suggestions and guidance for the future design of PET methods.

Keywords

parameter-efficient fine-tuning; process optimization; fault tree analysis

Cite

@article{arxiv.2206.07382,
  title  = {Sparse Structure Search for Parameter-Efficient Tuning},
  author = {Shengding Hu and Zhen Zhang and Ning Ding and Yadao Wang and Yasheng Wang and Zhiyuan Liu and Maosong Sun},
  journal= {arXiv preprint arXiv:2206.07382},
  year   = {2022}
}
