Related papers: Eigenpruning: an Interpretability-Inspired PEFT Me…

200 papers

Recent work on pruning large language models (LLMs) has shown that one can eliminate a large number of parameters without compromising performance, making pruning a promising strategy to reduce LLM model size. Existing LLM pruning…

Machine Learning · Computer Science 2024-10-16 Haiquan Lu, Yefan Zhou, Shiwei Liu, Zhangyang Wang, Michael W. Mahoney, Yaoqing Yang

Large-scale foundation models have demonstrated remarkable versatility across a wide range of downstream tasks. However, fully fine-tuning these models incurs prohibitive computational costs, motivating the development of…

Machine Learning · Computer Science 2025-05-30 Chongjie Si, Xuankun Yang, Muqing Liu, Yadao Wang, Xiaokang Yang, Wenbo Su, Bo Zheng, Wei Shen

Parameter Efficient Fine-Tuning (PEFT) methods have emerged as effective and promising approaches for fine-tuning pre-trained language models. Compared with Full parameter Fine-Tuning (FFT), PEFT achieved comparable task performance with a…

Machine Learning · Computer Science 2025-06-10 Tongzhou Yu, Zhuhao Zhang, Guanghui Zhu, Shen Jiang, Meikang Qiu, Yihua Huang

How is knowledge stored in an LLM's weights? We study this via layer pruning: if removing a certain layer does not affect model performance in common question-answering benchmarks, then the weights in that layer are not necessary for…

Computation and Language · Computer Science 2025-03-04 Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso, Daniel A. Roberts

Parameter-efficient fine-tuning (PEFT) has emerged as the predominant technique for fine-tuning in the era of large language models. However, existing PEFT methods still have inadequate training efficiency. Firstly, the utilization of…

Computation and Language · Computer Science 2024-06-07 Naibin Gu, Peng Fu, Xiyu Liu, Bowen Shen, Zheng Lin, Weiping Wang

The unmatched ability of Deep Neural Networks in capturing complex patterns in large and noisy datasets is often associated with their large hypothesis space, and consequently to the vast amount of parameters that characterize model…

Machine Learning · Computer Science 2026-02-25 Enrico Ballini, Luca Muscarnera, Alessio Fumagalli, Anna Scotti, Francesco Regazzoni

The evolving capabilities of large language models are accompanied by growing sizes and deployment costs, necessitating effective inference optimisation techniques. We propose a novel pruning method utilising centrality measures from graph…

Machine Learning · Computer Science 2024-12-02 David Hoffmann, Kailash Budhathoki, Matthaeus Kleindessner

Large Language Models (LLMs) have shown immense potential in enhancing various aspects of our daily lives, from conversational AI to search and AI assistants. However, their growing capabilities come at the cost of extremely large model…

Machine Learning · Computer Science 2025-02-27 Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yufa Zhou

Iterative pruning is one of the most effective compression methods for pre-trained language models. We discovered that finding the optimal pruning decision is an equality-constrained 0-1 Integer Linear Programming problem. The solution to…

Computation and Language · Computer Science 2023-05-23 Siyu Ren, Kenny Q. Zhu

Fine-tuning large language models (LLMs) on downstream tasks requires substantial computational resources. Selective PEFT, a class of parameter-efficient fine-tuning (PEFT) methodologies, aims to mitigate these computational challenges by…

Computation and Language · Computer Science 2025-06-24 Aradhye Agarwal, Suhas K Ramesh, Ayan Sengupta, Tanmoy Chakraborty

Existing fine-tuning methods use a single learning rate over all layers. In this paper, first, we discuss that trends of layer-wise weight variations by fine-tuning using a single learning rate do not match the well-known notion that…

Computer Vision and Pattern Recognition · Computer Science 2021-01-05 Youngmin Ro, Jin Young Choi

Pruning provides a practical solution to reduce the resources required to run large language models (LLMs) to benefit from their effective capabilities as well as control their cost for training and inference. Research on LLM pruning often…

Computation and Language · Computer Science 2025-10-28 Yuanhe Tian, Junjie Liu, Xican Yang, Haishan Ye, Yan Song

This paper presents a novel differentiable method for unstructured weight pruning of deep neural networks. Our learned-threshold pruning (LTP) method learns per-layer thresholds via gradient descent, unlike conventional methods where they…

Machine Learning · Computer Science 2021-03-22 Kambiz Azarian, Yash Bhalgat, Jinwon Lee, Tijmen Blankevoort

Adapting pre-trained neural models to downstream tasks has become the standard practice for obtaining high-quality models. In this work, we propose a novel model adaptation paradigm, adapting by pruning, which prunes neural connections in…

Machine Learning · Computer Science 2021-05-10 Yang Gao, Nicolo Colombo, Wei Wang

The rapid advancements in Large Language Models (LLMs) have revolutionized natural language processing (NLP) and related fields. However, fine-tuning these models for specific tasks remains computationally expensive and risks degrading…

Computation and Language · Computer Science 2024-12-17 Md Kowsher, Nusrat Jahan Prottasha, Prakash Bhat

The exponential growth of large language models (LLMs) like ChatGPT has revolutionized artificial intelligence, offering unprecedented capabilities in natural language processing. However, the extensive computational resources required for…

Computation and Language · Computer Science 2025-02-25 Ashhadul Islam, Samir Brahim Belhaouari, Amine Bermak

The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the…

Parameter-Efficient Fine-Tuning (PEFT) is a popular class of techniques that strive to adapt large models in a scalable and resource-efficient manner. Yet, the mechanisms underlying their training performance and generalization remain…

Machine Learning · Computer Science 2026-02-10 Zahra Rahimi Afzal, Tara Esmaeilbeig, Mojtaba Soltanalian, Mesrob I. Ohannessian

To remove redundant components of large language models (LLMs) without incurring significant computational costs, this work focuses on single-shot pruning without a retraining phase. We simplify the pruning process for Transformer-based…

Artificial Intelligence · Computer Science 2024-07-30 Jianwei Li, Yijun Dong, Qi Lei

Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing the memory footprint of large language models (LLMs) during inference. Notably, popular inference engines, such as vLLM, enable users to…

Machine Learning · Computer Science 2026-04-07 Kazuki Egashira, Robin Staab, Thibaud Gloaguen, Mark Vero, Martin Vechev