Related papers: Analyzing Bagging Methods for Language Models

Compressing Large Language Models with Automated Sub-Network Search

Large Language Models (LLMs) demonstrate exceptional reasoning abilities, enabling strong generalization across diverse tasks such as commonsense reasoning and instruction following. However, as LLMs scale, inference costs become…

Computation and Language · Computer Science 2025-02-06 Rhea Sanjay Sukthanker, Benedikt Staffler, Frank Hutter, Aaron Klein

Large Language Models Are Overparameterized Text Encoders

Large language models (LLMs) demonstrate strong performance as text embedding models when finetuned with supervised contrastive training. However, their large size balloons inference time and memory requirements. In this paper, we show that…

Computation and Language · Computer Science 2024-10-21 Thennal D K, Tim Fischer, Chris Biemann

A Survey on Model Compression for Large Language Models

Large Language Models (LLMs) have transformed natural language processing tasks successfully. Yet, their large size and high computational needs pose challenges for practical use, especially in resource-limited settings. Model compression…

Computation and Language · Computer Science 2024-07-31 Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang

Structured Pruning of Large Language Models

Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly,…

Computation and Language · Computer Science 2021-03-30 Ziheng Wang, Jeremy Wohlwend, Tao Lei

Do Generative Large Language Models need billions of parameters?

This paper presents novel systems and methodologies for the development of efficient large language models (LLMs). It explores the trade-offs between model size, performance, and computational resources, with the aim of maximizing the…

Computation and Language · Computer Science 2023-09-14 Sia Gholami, Marwan Omar

Small Language Models (SLMs) Can Still Pack a Punch: A survey (updated 2026)

As foundation AI models continue to increase in size, an important question arises - is massive scale the only path forward? This survey of about 160 papers presents a family of Small Language Models (SLMs) in the 1 to 8 billion parameter…

Computation and Language · Computer Science 2026-05-15 Akanksha Gupta, Bijo Thomas, Harshita Asnani, Phanindra Reddy Madduru, Samia Feroze, Shreyas Subramanian, Vikram Elango, Mecit Gungor

Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers

Large Language Models (LLMs) possess outstanding capabilities in addressing various natural language processing (NLP) tasks. However, the sheer size of these models poses challenges in terms of storage, training and inference due to the…

Computation and Language · Computer Science 2025-04-18 Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber

Systematic Weight Evaluation for Pruning Large Language Models: Enhancing Performance and Sustainability

The exponential growth of large language models (LLMs) like ChatGPT has revolutionized artificial intelligence, offering unprecedented capabilities in natural language processing. However, the extensive computational resources required for…

Computation and Language · Computer Science 2025-02-25 Ashhadul Islam, Samir Brahim Belhaouari, Amine Bermak

Systematic Evaluation of Optimization Techniques for Long-Context Language Models

Large language models (LLMs) excel across diverse natural language processing tasks but face resource demands and limited context windows. Although techniques like pruning, quantization, and token dropping can mitigate these issues, their…

Computation and Language · Computer Science 2025-08-04 Ammar Ahmed, Sheng Di, Franck Cappello, Zirui Liu, Jingoo Han, Ali Anwar

Crafting Efficient Fine-Tuning Strategies for Large Language Models

This paper addresses the challenges of efficiently fine-tuning large language models (LLMs) by exploring data efficiency and hyperparameter optimization. We investigate the minimum data required for effective fine-tuning and propose a novel…

Computation and Language · Computer Science 2024-07-22 Michael Oliver, Guan Wang

Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling

Many efforts have been made to facilitate natural language processing tasks with pre-trained language models (LMs), and brought significant improvements to various applications. To fully leverage the nearly unlimited corpora and capture…

Computation and Language · Computer Science 2018-09-11 Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han

Can pruning make Large Language Models more efficient?

Transformer models have revolutionized natural language processing with their unparalleled ability to grasp complex contextual relationships. However, the vast number of parameters in these models has raised concerns regarding computational…

Machine Learning · Computer Science 2023-10-10 Sia Gholami, Marwan Omar

Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation

Language models have steadily increased in size over the past few years. They achieve a high level of performance on various natural language processing (NLP) tasks such as question answering and summarization. Large language models (LLMs)…

Computation and Language · Computer Science 2023-01-31 Jessica Huynh, Cathy Jiao, Prakhar Gupta, Shikib Mehri, Payal Bajaj, Vishrav Chaudhary, Maxine Eskenazi

Evaluating Neural Language Models as Cognitive Models of Language Acquisition

The success of neural language models (LMs) on many technological tasks has brought about their potential relevance as scientific theories of language despite some clear differences between LM training and child language acquisition. In…

Computation and Language · Computer Science 2026-03-30 Héctor Javier Vázquez Martínez, Annika Lea Heuser, Charles Yang, Jordan Kodner

A Review on Edge Large Language Models: Design, Execution, and Applications

Large language models (LLMs) have revolutionized natural language processing with their exceptional understanding, synthesizing, and reasoning capabilities. However, deploying LLMs on resource-constrained edge devices presents significant…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-25 Yue Zheng, Yuhao Chen, Bin Qian, Xiufang Shi, Yuanchao Shu, Jiming Chen

1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators?

Large Language Models (LLMs) have garnered significant attention due to their remarkable ability to process information across various languages. Despite their capabilities, they exhibit inconsistencies in handling identical queries in…

Computation and Language · Computer Science 2024-06-24 Yue Huang, Chenrui Fan, Yuan Li, Siyuan Wu, Tianyi Zhou, Xiangliang Zhang, Lichao Sun

Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations

Large language models are ubiquitous in natural language processing because they can adapt to new tasks without retraining. However, their sheer scale and complexity present unique challenges and opportunities, prompting researchers and…

Computation and Language · Computer Science 2024-08-07 Leo Donisch, Sigurd Schacht, Carsten Lanquillon

Are Multilingual Models Effective in Code-Switching?

Multilingual language models have shown decent performance in multilingual and cross-lingual natural language understanding tasks. However, the power of these multilingual models in code-switching tasks has not been fully explored. In this…

Computation and Language · Computer Science 2021-03-25 Genta Indra Winata, Samuel Cahyawijaya, Zihan Liu, Zhaojiang Lin, Andrea Madotto, Pascale Fung

A Survey of Small Language Models

Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources, making them ideal for various settings including on-device,…

Computation and Language · Computer Science 2024-10-29 Chien Van Nguyen, Xuan Shen, Ryan Aponte, Yu Xia, Samyadeep Basu, Zhengmian Hu, Jian Chen, Mihir Parmar, Sasidhar Kunapuli, Joe Barrow, Junda Wu, Ashish Singh, Yu Wang, Jiuxiang Gu, Franck Dernoncourt, Nesreen K. Ahmed, Nedim Lipka, Ruiyi Zhang, Xiang Chen, Tong Yu, Sungchul Kim, Hanieh Deilamsalehy, Namyong Park, Mike Rimer, Zhehao Zhang, Huanrui Yang, Ryan A. Rossi, Thien Huu Nguyen

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages

Large language models (LLMs) have demonstrated remarkable abilities in representation learning for program synthesis and understanding tasks. The quality of the learned representations appears to be dictated by the neural scaling laws as a…

Machine Learning · Computer Science 2023-07-13 Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou