Related papers: Tiny language models

200 papers

Large Language Models (LLMs), originally developed for natural language processing (NLP), have demonstrated the potential to generalize across modalities and domains. With their in-context learning (ICL) capabilities, LLMs can perform…

Artificial Intelligence · Computer Science 2025-08-26 Nikolaos Pavlidis, Vasilis Perifanis, Symeon Symeonidis, Pavlos S. Efraimidis

Natural language processing (NLP) enables the understanding and generation of meaningful human language, typically using a pre-trained complex architecture on a large dataset to learn the language and next fine-tune its weights to implement…

Computation and Language · Computer Science 2025-09-04 Yarden Tzach, Ronit D. Gross, Ella Koresh, Shalom Rosner, Or Shpringer, Tal Halevi, Ido Kanter

Large language models (LLMs) have shown exceptional performance on a variety of natural language tasks. Yet, their capabilities for HTML understanding -- i.e., parsing the raw HTML of a webpage, with applications to automation of web-based…

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to the new state-of-the-art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a list of…

Computation and Language · Computer Science 2023-07-27 Tong Guo

Large Language Models (LLMs) and pre-trained Language Models (LMs) have achieved impressive success on many software engineering tasks (e.g., code completion and code generation). By leveraging huge existing code corpora (e.g., GitHub),…

Software Engineering · Computer Science 2025-01-16 Xin Yin, Chao Ni, Xiaodan Xu, Xinrui Li, Xiaohu Yang

Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach,…

Large Language Models (LLMs) have demonstrated remarkable performance across various natural language tasks, marking significant strides towards general artificial intelligence. While general artificial intelligence is leveraged by…

Computation and Language · Computer Science 2023-10-31 Yizhe Yang, Huashan Sun, Jiawei Li, Runheng Liu, Yinghao Li, Yuhang Liu, Heyan Huang, Yang Gao

Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train. We propose a simple and efficient learning framework, TLM, that does not rely on large-scale…

Computation and Language · Computer Science 2022-07-25 Xingcheng Yao, Yanan Zheng, Xiaocong Yang, Zhilin Yang

Trained on the large corpus, pre-trained language models (PLMs) can capture different levels of concepts in context and hence generate universal language representations. They can benefit multiple downstream natural language processing…

Computation and Language · Computer Science 2021-10-15 Nankai Lin, Yingwen Fu, Chuwei Chen, Ziyu Yang, Shengyi Jiang

Large Language Models (LLMs), typified by OpenAI's GPT, have marked a significant advancement in artificial intelligence. Trained on vast amounts of text data, LLMs are capable of understanding and generating human-like text across a…

Artificial Intelligence · Computer Science 2024-10-29 Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada

Pre-trained Large Language Models (LLMs) have shown success in a diverse set of language inference and understanding tasks. The pre-training stage of LLMs looks at a large corpus of raw textual data. The BabyLM shared task compares LLM…

Computation and Language · Computer Science 2024-01-11 Khushi Bhardwaj, Raj Sanjay Shah, Sashank Varma

Unlocking the potential of Large Language Models (LLMs) in data classification represents a promising frontier in natural language processing. In this work, we evaluate the performance of different LLMs in comparison with state-of-the-art…

Computation and Language · Computer Science 2025-01-16 Arina Kostina, Marios D. Dikaiakos, Dimosthenis Stefanidis, George Pallis

The rapid advancement of artificial intelligence, particularly with the development of Large Language Models (LLMs) built on the transformer architecture, has redefined the capabilities of natural language processing. These models now…

Computation and Language · Computer Science 2025-02-11 Andrea Matarazzo, Riccardo Torlone

Large Language Models (LLMs) exhibit a puzzling disparity in their formal linguistic competence: while they learn some linguistic phenomena with near-perfect mastery, they often perform below chance on others, even after training on…

Computation and Language · Computer Science 2026-04-21 H S V N S Kowndinya Renduchintala, Sumit Bhatia

While Large Language Models (LLMs) have exhibited remarkable emergent capabilities through extensive pre-training, they still face critical limitations in generalizing to specialized domains and handling diverse linguistic variations, known…

Computation and Language · Computer Science 2025-05-28 Jinwu Hu, Zhitian Zhang, Guohao Chen, Xutao Wen, Chao Shuai, Wei Luo, Bin Xiao, Yuanqing Li, Mingkui Tan

Large language models (LLMs) have demonstrated emergent abilities in text generation, question answering, and reasoning, facilitating various tasks and domains. Despite their proficiency in various tasks, LLMs like PaLM 540B and Llama-3.1…

Computation and Language · Computer Science 2024-12-31 Fali Wang, Zhiwei Zhang, Xianren Zhang, Zongyu Wu, Tzuhao Mo, Qiuhao Lu, Wanjing Wang, Rui Li, Junjie Xu, Xianfeng Tang, Qi He, Yao Ma, Ming Huang, Suhang Wang

Large Language Models (LLMs) represent a class of deep learning models adept at understanding natural language and generating coherent responses to various prompts or queries. These models far exceed the complexity of conventional neural…

Machine Learning · Computer Science 2024-12-05 Minghao Shao, Abdul Basit, Ramesh Karri, Muhammad Shafique

Transformer language models (TLMs) are critical for most NLP tasks, but they are difficult to create for low-resource languages because of how much pretraining data they require. In this work, we investigate two techniques for training…

Computation and Language · Computer Science 2023-01-06 Luke Gessler, Amir Zeldes

The impressive performance of large language models (LLMs) has led to their consideration as models of human language processing. Instead, we suggest that the success of LLMs arises from the flexibility of the transformer learning…

Computation and Language · Computer Science 2024-11-19 Xiaoliang Luo, Michael Ramscar, Bradley C. Love

In the rapidly evolving field of Explainable Natural Language Processing (NLP), textual explanations, i.e., human-like rationales, are pivotal for explaining model predictions and enriching datasets with interpretable labels. Traditional…

Computation and Language · Computer Science 2025-11-12 Mahdi Dhaini, Juraj Vladika, Ege Erdogan, Zineb Attaoui, Gjergji Kasneci