Related papers: Super Tiny Language Models

200 papers

Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources, making them ideal for various settings including on-device,…

Small language models (SLMs), despite their widespread adoption in modern smart devices, have received significantly less academic attention compared to their large language model (LLM) counterparts, which are predominantly deployed in data…

Computation and Language · Computer Science 2025-02-27 Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu

As foundation AI models continue to increase in size, an important question arises - is massive scale the only path forward? This survey of about 160 papers presents a family of Small Language Models (SLMs) in the 1 to 8 billion parameter…

Large language models (LLMs) have achieved remarkable advancements in natural language processing, showcasing exceptional performance across various tasks. However, the expensive memory and computational requirements present significant…

Artificial Intelligence · Computer Science 2025-11-13 Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Haotong Qin, Jinyang Guo, Michele Magno, Xianglong Liu

Tokenization is a fundamental component of large language models (LLMs), yet its influence on model scaling and performance is not fully explored. In this paper, we introduce Over-Tokenized Transformers, a novel framework that decouples…

Computation and Language · Computer Science 2025-05-26 Hongzhi Huang, Defa Zhu, Banggu Wu, Yutao Zeng, Ya Wang, Qiyang Min, Xun Zhou

Recent advancements in large language models (LLMs) have remarkably enhanced performances on a variety of tasks in multiple languages. However, tokenizers in LLMs trained primarily on English-centric corpora often overly fragment a text…

Computation and Language · Computer Science 2024-08-07 Jimin Hong, Gibbeum Lee, Jaewoong Cho

Small Language Models (SLMs) have gained substantial attention due to their ability to execute diverse language tasks successfully while using fewer computer resources. These models are particularly ideal for deployment in limited…

Computation and Language · Computer Science 2025-05-30 Tanjil Hasan Sakib, Md. Tanzib Hosain, Md. Kishor Morol

This paper presents novel systems and methodologies for the development of efficient large language models (LLMs). It explores the trade-offs between model size, performance, and computational resources, with the aim of maximizing the…

Computation and Language · Computer Science 2023-09-14 Sia Gholami, Marwan Omar

In recent years, large language models (LLMs) have achieved remarkable success in natural language processing (NLP). LLMs require an extreme amount of parameters to attain high performance. As models grow into the trillion-parameter range,…

Computation and Language · Computer Science 2024-09-10 Zhyar Rzgar K Rostam, Sándor Szénási, Gábor Kertész

While large language models have facilitated breakthroughs in many applications of artificial intelligence, their inherent largeness makes them computationally expensive and challenging to deploy in resource-constrained settings. In this…

Speech tokenization serves as the foundation of speech language models (LMs), enabling them to perform various tasks such as spoken language modeling, text-to-speech, speech-to-text, etc. Most speech tokenizers are trained independently of…

Computation and Language · Computer Science 2024-09-11 Arnon Turetzky, Yossi Adi

Spoken language models (SLMs) typically discretize speech into high-frame-rate tokens extracted from SSL speech models. As the most successful LMs are based on the Transformer architecture, processing these long token streams with…

Computation and Language · Computer Science 2026-02-05 Nicholas Lee, Cheol Jun Cho, Alan W Black, Gopala K. Anumanchipalli

The surprising ability of Large Language Models (LLMs) to perform well on complex reasoning with only few-shot chain-of-thought prompts is believed to emerge only in very large-scale models (100+ billion parameters). We show that such…

Computation and Language · Computer Science 2023-01-31 Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot

Recent works have shown a surprising result: a small fraction of Large Language Model (LLM) parameter outliers are disproportionately important to the quality of the model. LLMs contain billions of parameters, so these small fractions, such…

Computation and Language · Computer Science 2025-07-08 Mengxia Yu, De Wang, Qi Shan, Colorado J Reed, Alvin Wan

The ongoing evolution of language models has led to the development of large-scale architectures that demonstrate exceptional performance across a wide range of tasks. However, these models come with significant computational and energy…

Artificial Intelligence · Computer Science 2025-11-19 Xialie Zhuang, Peixian Ma, Zhikai Jia, Zane Cao, Shiwei Liu

Large Language Models (LLMs) have been emerging as prominent AI models for solving many natural language tasks due to their high performance (e.g., accuracy) and capabilities in generating high-quality responses to the given inputs.…

Neural and Evolutionary Computing · Computer Science 2026-04-22 Rachmad Vidya Wicaksana Putra, Pasindu Wickramasinghe, Muhammad Shafique

The recent advancements of Small Language Models (SLMs) have opened new possibilities for efficient code generation. SLMs offer lightweight and cost-effective alternatives to Large Language Models (LLMs), making them attractive for use in…

Software Engineering · Computer Science 2026-01-21 Md Mahade Hasan, Muhammad Waseem, Kai-Kristian Kemell, Jussi Rasku, Juha Ala-Rantala, Pekka Abrahamsson

Large language models (LLMs) show best-in-class performance across a wide range of natural language processing applications. Training these models is an extremely computationally expensive task; frontier Artificial Intelligence (AI)…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-10 Alexander Interrante-Grant, Carla Varela-Rosa, Suhaas Narayan, Chris Connelly, Albert Reuther

Speech Language Models (SLMs) aim to learn language from raw audio, without textual resources. Despite significant advances, our current models exhibit weak syntax and semantic abilities. However, if the scaling properties of neural…

Audio and Speech Processing · Electrical Eng. & Systems 2024-12-13 Santiago Cuervo, Ricard Marxer

Fueled by their remarkable ability to tackle diverse tasks across multiple domains, large language models (LLMs) have grown at an unprecedented rate, with some recent models containing trillions of parameters. This growth is accompanied by…

Machine Learning · Computer Science 2025-05-30 Athanasios Glentis, Jiaxiang Li, Qiulin Shang, Andi Han, Ioannis Tsaknakis, Quan Wei, Mingyi Hong