Related papers: Stacking Small Language Models for Generalizabilit…

Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems

Large language models (LLMs) have demonstrated remarkable performance across a wide range of industrial applications, from search and recommendation systems to generative tasks. Although scaling laws indicate that larger models generally…

Information Retrieval · Computer Science 2025-10-28 Kayhan Behdin, Ata Fatahibaarzi, Qingquan Song, Yun Dai, Aman Gupta, Zhipeng Wang, Shao Tang, Hejian Sang, Gregory Dexter, Sirou Zhu, Siyu Zhu, Tejas Dharamsi, Vignesh Kothapalli, Zhoutong Fu, Yihan Cao, Pin-Lun Hsu, Fedor Borisyuk, Natesh Pillai, Luke Simon, Rahul Mazumder

Generalizing Large Language Model Usability Across Resource-Constrained

Large Language Models (LLMs) have achieved remarkable success across a wide range of natural language tasks, and recent efforts have sought to extend their capabilities to multimodal domains and resource-constrained environments. However,…

Machine Learning · Computer Science 2025-05-26 Yun-Da Tsai

Rethinking Data: Towards Better Performing Domain-Specific Small Language Models

Fine-tuning of Large Language Models (LLMs) for downstream tasks, performed on domain-specific data, has shown significant promise. However, commercial use of such LLMs is limited by the high computational cost required for their deployment…

Computation and Language · Computer Science 2025-03-06 Boris Nazarov, Darya Frolova, Yackov Lubarsky, Alexei Gaissinski, Pavel Kisilev

Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation

Small Language models (SLMs) offer an efficient and accessible alternative to Large Language Models (LLMs), delivering strong performance while using far fewer resources. We introduce a simple and effective framework for pretraining SLMs…

Computation and Language · Computer Science 2026-01-15 Arjun Krishnakumar, Rhea Sanjay Sukthanker, Hannan Javed Mahadik, Gabriela Kadlecová, Vladyslav Moroshan, Timur Carstensen, Frank Hutter, Aaron Klein

Small Language Models: Survey, Measurements, and Insights

Small language models (SLMs), despite their widespread adoption in modern smart devices, have received significantly less academic attention compared to their large language model (LLM) counterparts, which are predominantly deployed in data…

Computation and Language · Computer Science 2025-02-27 Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu

Small Language Models: Architectures, Techniques, Evaluation, Problems and Future Adaptation

Small Language Models (SLMs) have gained substantial attention due to their ability to execute diverse language tasks successfully while using fewer computer resources. These models are particularly ideal for deployment in limited…

Computation and Language · Computer Science 2025-05-30 Tanjil Hasan Sakib, Md. Tanzib Hosain, Md. Kishor Morol

A Survey of Small Language Models

Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources, making them ideal for various settings including on-device,…

Computation and Language · Computer Science 2024-10-29 Chien Van Nguyen, Xuan Shen, Ryan Aponte, Yu Xia, Samyadeep Basu, Zhengmian Hu, Jian Chen, Mihir Parmar, Sasidhar Kunapuli, Joe Barrow, Junda Wu, Ashish Singh, Yu Wang, Jiuxiang Gu, Franck Dernoncourt, Nesreen K. Ahmed, Nedim Lipka, Ruiyi Zhang, Xiang Chen, Tong Yu, Sungchul Kim, Hanieh Deilamsalehy, Namyong Park, Mike Rimer, Zhehao Zhang, Huanrui Yang, Ryan A. Rossi, Thien Huu Nguyen

Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers

Large Language Models (LLMs) possess outstanding capabilities in addressing various natural language processing (NLP) tasks. However, the sheer size of these models poses challenges in terms of storage, training and inference due to the…

Computation and Language · Computer Science 2025-04-18 Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber

Optimising Language Models for Downstream Tasks: A Post-Training Perspective

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often…

Computation and Language · Computer Science 2025-06-27 Zhengyan Shi

Large Language Models: An Applied Econometric Framework

Large language models (LLMs) enable researchers to analyze text at unprecedented scale and minimal cost. Researchers can now revisit old questions and tackle novel ones with rich data. We provide an econometric framework for realizing this…

Econometrics · Economics 2025-12-08 Jens Ludwig, Sendhil Mullainathan, Ashesh Rambachan

Does Model Size Matter? A Comparison of Small and Large Language Models for Requirements Classification

[Context and motivation] Large language models (LLMs) show notable results in natural language processing (NLP) tasks for requirements engineering (RE). However, their use is compromised by high computational cost, data sharing risks, and…

Software Engineering · Computer Science 2025-10-27 Mohammad Amin Zadenoori, Vincenzo De Martino, Jacek Dabrowski, Xavier Franch, Alessio Ferrari

Emissions and Performance Trade-off Between Small and Large Language Models

The advent of Large Language Models (LLMs) has raised concerns about their enormous carbon footprint, starting with energy-intensive training and continuing through repeated inference. This study investigates the potential of using…

Computation and Language · Computer Science 2026-01-15 Anandita Garg, Uma Gaba, Deepan Muthirayan, Anish Roy Chowdhury

SLearnLLM: A Self-Learning Framework for Efficient Domain-Specific Adaptation of Large Language Models

When using supervised fine-tuning (SFT) to adapt large language models (LLMs) to specific domains, a significant challenge arises: should we use the entire SFT dataset for fine-tuning? Common practice often involves fine-tuning directly on…

Computation and Language · Computer Science 2025-05-26 Xiang Liu, Zhaoxiang Liu, Peng Wang, Kohou Wang, Huan Hu, Kai Wang, Shiguo Lian

Interpretability of Language Models via Task Spaces

The usual way to interpret language models (LMs) is to test their performance on different benchmarks and subsequently infer their internal processes. In this paper, we present an alternative approach, concentrating on the quality of LM…

Computation and Language · Computer Science 2024-06-11 Lucas Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes

Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models

Large language models (LLMs) are renowned for their extensive linguistic knowledge and strong generalization capabilities, but their high computational demands make them unsuitable for resource-constrained environments. In contrast, small…

Computation and Language · Computer Science 2025-06-10 Kyeonghyun Kim, Jinhee Jang, Juhwan Choi, Yoonji Lee, Kyohoon Jin, YoungBin Kim

Fine-tuning Large Language Models with Limited Data: A Survey and Practical Guide

Fine-tuning large language models (LLMs) with limited data poses a practical challenge in low-resource languages, specialized domains, and constrained deployment settings. While pre-trained LLMs provide strong foundations, effective…

Computation and Language · Computer Science 2025-10-29 Marton Szep, Daniel Rueckert, Rüdiger von Eisenhart-Rothe, Florian Hinterwimmer

Unveiling the Generalization Power of Fine-Tuned Large Language Models

While Large Language Models (LLMs) have demonstrated exceptional multitasking abilities, fine-tuning these models on downstream, domain-specific datasets is often necessary to yield superior performance on test sets compared to their…

Computation and Language · Computer Science 2024-03-15 Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng Ann Heng, Wai Lam

Fine-tuning Large Language Models for Entity Matching

Generative large language models (LLMs) are a promising alternative to pre-trained language models for entity matching due to their high zero-shot performance and ability to generalize to unseen entities. Existing research on using LLMs for…

Computation and Language · Computer Science 2025-05-22 Aaron Steiner, Ralph Peeters, Christian Bizer

Evaluating Small Decoder-Only Language Models for Grammar Correction and Text Simplification

Large language models have become extremely popular recently due to their ability to achieve strong performance on a variety of tasks, such as text generation and rewriting, but their size and computation cost make them difficult to access,…

Computation and Language · Computer Science 2026-01-08 Anthony Lamelas

Supervised Knowledge Makes Large Language Models Better In-context Learners

Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The recent progress in large-scale generative models has further expanded their use in real-world language applications. However, the…

Computation and Language · Computer Science 2024-04-12 Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang