Related papers: Stacking Small Language Models for Generalizabilit…

200 papers

Large language models (LLMs) have demonstrated remarkable performance across a wide range of industrial applications, from search and recommendation systems to generative tasks. Although scaling laws indicate that larger models generally…

Large Language Models (LLMs) have achieved remarkable success across a wide range of natural language tasks, and recent efforts have sought to extend their capabilities to multimodal domains and resource-constrained environments. However,…

Machine Learning · Computer Science 2025-05-26 Yun-Da Tsai

Fine-tuning of Large Language Models (LLMs) for downstream tasks, performed on domain-specific data, has shown significant promise. However, commercial use of such LLMs is limited by the high computational cost required for their deployment…

Computation and Language · Computer Science 2025-03-06 Boris Nazarov, Darya Frolova, Yackov Lubarsky, Alexei Gaissinski, Pavel Kisilev

Small Language Models (SLMs) offer an efficient and accessible alternative to Large Language Models (LLMs), delivering strong performance while using far fewer resources. We introduce a simple and effective framework for pretraining SLMs…

Small language models (SLMs), despite their widespread adoption in modern smart devices, have received significantly less academic attention compared to their large language model (LLM) counterparts, which are predominantly deployed in data…

Computation and Language · Computer Science 2025-02-27 Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu

Small Language Models (SLMs) have gained substantial attention due to their ability to execute diverse language tasks successfully while using fewer computer resources. These models are particularly ideal for deployment in limited…

Computation and Language · Computer Science 2025-05-30 Tanjil Hasan Sakib, Md. Tanzib Hosain, Md. Kishor Morol

Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources, making them ideal for various settings including on-device,…

Large Language Models (LLMs) possess outstanding capabilities in addressing various natural language processing (NLP) tasks. However, the sheer size of these models poses challenges in terms of storage, training and inference due to the…

Computation and Language · Computer Science 2025-04-18 Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often…

Computation and Language · Computer Science 2025-06-27 Zhengyan Shi

Large language models (LLMs) enable researchers to analyze text at unprecedented scale and minimal cost. Researchers can now revisit old questions and tackle novel ones with rich data. We provide an econometric framework for realizing this…

Econometrics · Economics 2025-12-08 Jens Ludwig, Sendhil Mullainathan, Ashesh Rambachan

[Context and motivation] Large language models (LLMs) show notable results in natural language processing (NLP) tasks for requirements engineering (RE). However, their use is compromised by high computational cost, data sharing risks, and…

Software Engineering · Computer Science 2025-10-27 Mohammad Amin Zadenoori, Vincenzo De Martino, Jacek Dabrowski, Xavier Franch, Alessio Ferrari

The advent of Large Language Models (LLMs) has raised concerns about their enormous carbon footprint, starting with energy-intensive training and continuing through repeated inference. This study investigates the potential of using…

Computation and Language · Computer Science 2026-01-15 Anandita Garg, Uma Gaba, Deepan Muthirayan, Anish Roy Chowdhury

When using supervised fine-tuning (SFT) to adapt large language models (LLMs) to specific domains, a significant challenge arises: should we use the entire SFT dataset for fine-tuning? Common practice often involves fine-tuning directly on…

Computation and Language · Computer Science 2025-05-26 Xiang Liu, Zhaoxiang Liu, Peng Wang, Kohou Wang, Huan Hu, Kai Wang, Shiguo Lian

The usual way to interpret language models (LMs) is to test their performance on different benchmarks and subsequently infer their internal processes. In this paper, we present an alternative approach, concentrating on the quality of LM…

Computation and Language · Computer Science 2024-06-11 Lucas Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes

Large language models (LLMs) are renowned for their extensive linguistic knowledge and strong generalization capabilities, but their high computational demands make them unsuitable for resource-constrained environments. In contrast, small…

Computation and Language · Computer Science 2025-06-10 Kyeonghyun Kim, Jinhee Jang, Juhwan Choi, Yoonji Lee, Kyohoon Jin, YoungBin Kim

Fine-tuning large language models (LLMs) with limited data poses a practical challenge in low-resource languages, specialized domains, and constrained deployment settings. While pre-trained LLMs provide strong foundations, effective…

Computation and Language · Computer Science 2025-10-29 Marton Szep, Daniel Rueckert, Rüdiger von Eisenhart-Rothe, Florian Hinterwimmer

While Large Language Models (LLMs) have demonstrated exceptional multitasking abilities, fine-tuning these models on downstream, domain-specific datasets is often necessary to yield superior performance on test sets compared to their…

Computation and Language · Computer Science 2024-03-15 Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng Ann Heng, Wai Lam

Generative large language models (LLMs) are a promising alternative to pre-trained language models for entity matching due to their high zero-shot performance and ability to generalize to unseen entities. Existing research on using LLMs for…

Computation and Language · Computer Science 2025-05-22 Aaron Steiner, Ralph Peeters, Christian Bizer

Large language models have become extremely popular recently due to their ability to achieve strong performance on a variety of tasks, such as text generation and rewriting, but their size and computation cost make them difficult to access,…

Computation and Language · Computer Science 2026-01-08 Anthony Lamelas

Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The recent progress in large-scale generative models has further expanded their use in real-world language applications. However, the…

Computation and Language · Computer Science 2024-04-12 Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang