English
Related papers

Related papers: LMD3: Language Model Data Density Dependence

200 papers

Large language models achieve high performance on many but not all downstream tasks. The interaction between pretraining data and task data is commonly assumed to determine this variance: a task with data that is more similar to a model's…

Computation and Language · Computer Science 2023-11-16 Gregory Yauney , Emily Reif , David Mimno

This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. Our model, developed as a part of the recent SemEval task, is based on fine-tuning individual language models (BERT,…

Computation and Language · Computer Science 2024-07-04 Kota Shamanth Ramanath Nayak , Leila Kosseim

Much of the success of modern language models depends on finding a suitable prompt to instruct the model. Until now, it has been largely unknown how variations in the linguistic expression of prompts affect these models. This study…

Computation and Language · Computer Science 2026-02-17 Jan Philip Wahle , Terry Ruas , Yang Xu , Bela Gipp

Language model-based instruction-following systems have lately shown increasing performance on many benchmark tasks, demonstrating the capability of adapting to a broad variety of instructions. However, such systems are often not designed…

Computation and Language · Computer Science 2024-03-20 Rahul Nadkarni , Yizhong Wang , Noah A. Smith

Neural dependency parsing has achieved remarkable performance for many domains and languages. The bottleneck of massive labeled data limits the effectiveness of these approaches for low resource languages. In this work, we focus on…

Computation and Language · Computer Science 2021-04-13 Jivnesh Sandhan , Amrith Krishna , Ashim Gupta , Laxmidhar Behera , Pawan Goyal

Recently published work on rephrasing natural text data for pre-training LLMs has shown promising results when combining the original dataset with the synthetically rephrased data. We build upon previous work by replicating existing results…

Large Language Models (LLMs) are highly vulnerable to input perturbations, as even a small prompt change may result in a substantially different output. Existing methods to enhance LLM robustness are primarily focused on perturbed data…

Computation and Language · Computer Science 2025-04-04 Aryan Agrawal , Lisa Alazraki , Shahin Honarvar , Marek Rei

Language models can be prompted to perform a wide variety of zero- and few-shot learning problems. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best…

Computation and Language · Computer Science 2024-09-16 Hila Gonen , Srini Iyer , Terra Blevins , Noah A. Smith , Luke Zettlemoyer

Data augmentation is an essential technique in natural language processing (NLP) for enriching training datasets by generating diverse samples. This process is crucial for improving the robustness and generalization capabilities of NLP…

Computation and Language · Computer Science 2025-10-16 Zaitian Wang , Jinghan Zhang , Xinhao Zhang , Kunpeng Liu , Pengfei Wang , Yuanchun Zhou

We investigate the performance of large language models on repetitive deterministic prediction tasks and study how the sequence accuracy rate scales with output length. Each such task involves repeating the same operation n times. Examples…

Artificial Intelligence · Computer Science 2025-11-25 Wanda Hou , Leon Zhou , Hong-Ye Hu , Yubei Chen , Yi-Zhuang You , Xiao-Liang Qi

The increasing size and complexity of pre-trained language models have demonstrated superior performance in many applications, but they usually require large training datasets to be adequately trained. Insufficient training sets could…

Computation and Language · Computer Science 2025-02-03 Yaping Chai , Haoran Xie , Joe S. Qin

The programming skill is one crucial ability for Large Language Models (LLMs), necessitating a deep understanding of programming languages (PLs) and their correlation with natural languages (NLs). We examine the impact of pre-training data…

Computation and Language · Computer Science 2024-02-21 Demin Song , Honglin Guo , Yunhua Zhou , Shuhao Xing , Yudong Wang , Zifan Song , Wenwei Zhang , Qipeng Guo , Hang Yan , Xipeng Qiu , Dahua Lin

Large language models have the potential to generate explanations for their own predictions in a variety of styles based on user instructions. Recent research has examined whether these self-explanations faithfully reflect the models'…

Computation and Language · Computer Science 2025-12-09 Tomoki Doi , Masaru Isonuma , Hitomi Yanaka

Large language models are trained on massive scrapes of the web, as required by current scaling laws. Most progress is made for English, given its abundance of high-quality pretraining data. For most other languages, however, such high…

Computation and Language · Computer Science 2025-02-07 Skyler Seto , Maartje ter Hoeve , Richard He Bai , Natalie Schluter , David Grangier

While the Large Language Models (LLMs) dominate a majority of language understanding tasks, previous work shows that some of these results are supported by modelling spurious correlations of training datasets. Authors commonly assess model…

Computation and Language · Computer Science 2024-02-07 Lukáš Mikula , Michal Štefánik , Marek Petrovič , Petr Sojka

Long-context modeling capabilities are important for large language models (LLMs) in various applications. However, directly training LLMs with long context windows is insufficient to enhance this capability since some training samples do…

Computation and Language · Computer Science 2024-05-29 Longze Chen , Ziqiang Liu , Wanwei He , Yunshui Li , Run Luo , Min Yang

Training data influence estimation methods quantify the contribution of training documents to a model's output, making them a promising source of information for example-based explanations. As humans cannot interpret thousands of documents,…

Computation and Language · Computer Science 2026-04-10 Loris Schoenegger , Benjamin Roth

Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the…

Computation and Language · Computer Science 2023-11-03 Megh Thakkar , Tolga Bolukbasi , Sriram Ganapathy , Shikhar Vashishth , Sarath Chandar , Partha Talukdar

Traditional scaling laws in natural language processing suggest that increasing model size and training data enhances performance. However, recent studies reveal deviations, particularly in large language models, where performance…

Machine Learning · Computer Science 2025-07-16 Zhengyu Chen , Siqi Wang , Teng Xiao , Yudong Wang , Shiqi Chen , Xunliang Cai , Junxian He , Jingang Wang

We are exposed to much information trying to influence us, such as teaser messages, debates, politically framed news, and propaganda - all of which use persuasive language. With the recent interest in Large Language Models (LLMs), we study…

Computation and Language · Computer Science 2025-02-24 Amalie Brogaard Pauli , Isabelle Augenstein , Ira Assent
‹ Prev 1 2 3 10 Next ›