Related papers: Adapt-Pruner: Adaptive Structural Pruning for Effi…

200 papers

Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. However, such impressive capability typically comes with a substantial model size, which presents significant challenges in both the…

Computation and Language · Computer Science 2023-09-29 Xinyin Ma, Gongfan Fang, Xinchao Wang

The popularity of LLaMA (Touvron et al., 2023a;b) and other recently emerged moderate-sized large language models (LLMs) highlights the potential of building smaller yet powerful LLMs. Regardless, the cost of training such models from…

Computation and Language · Computer Science 2024-04-12 Mengzhou Xia, Tianyu Gao, Zhiyuan Zeng, Danqi Chen

Large Language Models (LLMs), such as LLaMA and T5, have shown exceptional performance across various tasks through fine-tuning. Although low-rank adaption (LoRA) has emerged to cheaply fine-tune these LLMs on downstream tasks, their…

Machine Learning · Computer Science 2024-08-08 Mingyang Zhang, Hao Chen, Chunhua Shen, Zhen Yang, Linlin Ou, Xinyi Yu, Bohan Zhuang

Large language models (LLMs) have achieved outstanding performance in natural language processing, but enormous model sizes and high computational costs limit their practical deployment. Structured pruning can effectively reduce the…

Computation and Language · Computer Science 2025-03-11 Jun Kong, Xinge Ma, Jin Wang, Xuejie Zhang

Despite exceptional capabilities, Large Language Models (LLMs) still face deployment challenges due to their enormous size. Post-training structured pruning is a promising solution that prunes LLMs without the need for retraining, reducing…

Machine Learning · Computer Science 2025-02-21 Weizhong Huang, Yuxin Zhang, Xiawu Zheng, Fei Chao, Rongrong Ji

While Multimodal Large Language Models (MLLMs) demonstrate impressive capabilities, their substantial computational and memory requirements pose significant barriers to practical deployment. Current parameter reduction techniques primarily…

Computation and Language · Computer Science 2025-07-29 Yiran Huang, Lukas Thede, Massimiliano Mancini, Wenjia Xu, Zeynep Akata

With the rapid scaling of large language models (LLMs), structured pruning has become a widely used technique to learn efficient, smaller models from larger ones, delivering superior performance compared to training similarly sized models…

Computation and Language · Computer Science 2025-06-04 Bairu Hou, Qibin Chen, Jianyu Wang, Guoli Yin, Chong Wang, Nan Du, Ruoming Pang, Shiyu Chang, Tao Lei

Large language models (LLMs) have rapidly advanced in recent years, achieving remarkable performance across a wide range of natural language processing tasks. However, this progress has come at the cost of increasingly large model sizes,…

Large language models (LLMs) have garnered significant attention and demonstrated impressive capabilities in a wide range of applications. However, due to their enormous computational costs, the deployment and application of LLMs are often…

Machine Learning · Computer Science 2025-05-30 Jialong Guo, Xinghao Chen, Yehui Tang, Yunhe Wang

Large Language Models (LLMs) have achieved significant success across various NLP tasks. However, their massive computational costs limit their widespread use, particularly in real-time applications. Structured pruning offers an effective…

Machine Learning · Computer Science 2025-03-06 Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, Dan Alistarh

The recent advancements in large language models (LLMs) have significantly improved language understanding and generation capabilities. However, it is difficult to deploy LLMs on resource-constrained edge devices due to their high…

Computation and Language · Computer Science 2024-12-20 Haotian Zheng, Jinke Ren, Yushan Sun, Ruichen Zhang, Wenbo Zhang, Zhen Li, Dusit Niyato, Shuguang Cui, Yatong Han

Structured pruning of modern large language models (LLMs) has emerged as a way of decreasing their high computational needs. Width pruning reduces the size of projection weight matrices (e.g., by removing attention heads) while maintaining…

Machine Learning · Computer Science 2024-06-25 Bo-Kyeong Kim, Geonmin Kim, Tae-Ho Kim, Thibault Castells, Shinkook Choi, Junho Shin, Hyoung-Kyu Song

Large Language Models (LLMs) have achieved remarkable success across a wide spectrum of natural language processing tasks. However, their ever-growing scale introduces significant barriers to real-world deployment, including substantial…

Computation and Language · Computer Science 2026-01-07 Guangxin Wu, Hao Zhang, Zhang Zhibin, Jiafeng Guo, Xueqi Cheng

Fine-tuning and inference with large Language Models (LM) are generally known to be expensive. Parameter-efficient fine-tuning over pretrained LMs reduces training memory by updating a small number of LM parameters but does not improve…

Computation and Language · Computer Science 2024-06-05 Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao

Large language models (LLMs) have garnered significant attention for their remarkable capabilities across various domains, whose vast parameter scales present challenges for practical deployment. Structured pruning is an effective method to…

Artificial Intelligence · Computer Science 2024-12-25 Gui Ling, Ziyang Wang, Yuliang Yan, Qingwen Liu

Despite the remarkable success of Large Language Models (LLMs), the massive size poses significant deployment challenges, particularly on resource-constrained hardware. While existing LLM compression methods focus on quantization, pruning…

Artificial Intelligence · Computer Science 2023-10-12 Song Guo, Jiahang Xu, Li Lyna Zhang, Mao Yang

The extensive application of Large Language Models (LLMs) in generative coding tasks has raised concerns due to their high computational demands and energy consumption. Unlike previous structural pruning methods designed for classification…

Software Engineering · Computer Science 2025-04-25 Guang Yang, Yu Zhou, Xiangyu Zhang, Wei Cheng, Ke Liu, Xiang Chen, Terry Yue Zhuo, Taolue Chen

Large Language Models (LLMs) have achieved remarkable success in various natural language processing tasks, including language modeling, understanding, and generation. However, the increased memory and computational costs associated with…

Computation and Language · Computer Science 2024-11-05 Shangqian Gao, Chi-Heng Lin, Ting Hua, Tang Zheng, Yilin Shen, Hongxia Jin, Yen-Chang Hsu

While Large Vision Language Models (LVLMs) demonstrate impressive capabilities, their substantial computational and memory requirements pose deployment challenges on resource-constrained edge devices. Current parameter reduction techniques…

Computation and Language · Computer Science 2026-04-28 Yiran Huang, Lukas Thede, Massimiliano Mancini, Wenjia Xu, Zeynep Akata

The remarkable success of Large Language Models (LLMs) relies heavily on their substantial scale, which poses significant challenges during model deployment in terms of latency and memory consumption. Recently, numerous studies have…

Computation and Language · Computer Science 2024-12-19 Weiyu Huang, Yuezhou Hu, Guohao Jian, Jun Zhu, Jianfei Chen