Eigenpruning: an Interpretability-Inspired PEFT Method

Tomás Vergara-Browne; Álvaro Soto; Akiko Aizawa

Machine Learning, Artificial Intelligence | 2024-06-21 | v5

Abstract

We introduce eigenpruning, a method that removes singular values from weight matrices in an LLM to improve its performance on a particular task. The method is inspired by interpretability methods designed to automatically find subnetworks of a model that solve a specific task. In our tests, the pruned model outperforms the original model by a large margin, while requiring only minimal computation to prune the weight matrices. On a small synthetic integer multiplication task, the Phi-2 model improves its test-set accuracy from 13.75% to 97.50%. Interestingly, these results seem to indicate the existence of a computation path that can solve the task very effectively but was not being used by the original model. Finally, we publicly release our implementation.
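
To make the core operation concrete, the following is a minimal sketch of what "removing singular values from a weight matrix" can look like with NumPy. It is not the authors' released implementation: the function name `remove_singular_values`, the random matrix `W`, and the choice of which indices to drop are all hypothetical. In particular, the abstract does not describe how eigenpruning decides which singular values to remove, so the indices here are supplied by hand purely for illustration.

```python
import numpy as np


def remove_singular_values(weight, drop_indices):
    """Return a copy of `weight` with the chosen singular values set to zero.

    `drop_indices` is an illustrative input; how to choose it for a given
    task is the part of the method this sketch does not cover.
    """
    # Thin SVD: weight == u @ diag(s) @ vt, with s in descending order.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    s_pruned = s.copy()
    s_pruned[np.asarray(drop_indices, dtype=int)] = 0.0
    # Scale the columns of u by the pruned singular values, then recombine.
    return (u * s_pruned) @ vt


# Hypothetical stand-in for one weight matrix of a model.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))

# Drop the two smallest singular values (an arbitrary choice for the demo).
W_pruned = remove_singular_values(W, drop_indices=[2, 3])
print(np.linalg.matrix_rank(W), np.linalg.matrix_rank(W_pruned))
```

The pruned matrix keeps the same shape as the original, so under this sketch it could replace the original weights without changing the surrounding architecture; only the set of directions the layer can express is reduced.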

Keywords

model compression; network pruning; parameter-efficient fine-tuning

Cite

@article{arxiv.2404.03147,
  title  = {Eigenpruning: an Interpretability-Inspired PEFT Method},
  author = {Tomás Vergara-Browne and Álvaro Soto and Akiko Aizawa},
  journal= {arXiv preprint arXiv:2404.03147},
  year   = {2024}
}

Comments

Extended abstract accepted to LatinX at NAACL 2024
