Related papers: Task Oriented In-Domain Data Augmentation

200 papers

The increasing size and complexity of pre-trained language models have demonstrated superior performance in many applications, but they usually require large training datasets to be adequately trained. Insufficient training sets could…

Computation and Language · Computer Science 2025-02-03 Yaping Chai, Haoran Xie, Joe S. Qin

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before…

Fine-tuning on task-specific question-answer pairs is a predominant method for enhancing the performance of instruction-tuned large language models (LLMs) on downstream tasks. However, in certain specialized domains, such as healthcare or…

Computation and Language · Computer Science 2024-10-18 Shuyang Jiang, Yusheng Liao, Ya Zhang, Yanfeng Wang, Yu Wang

Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the…

Computation and Language · Computer Science 2020-05-07 Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith

With the capabilities of understanding and executing natural language instructions, large language models (LLMs) can potentially act as a powerful tool for textual data augmentation. However, the quality of augmented data depends heavily on…

Computation and Language · Computer Science 2024-04-30 Yichuan Li, Kaize Ding, Jianling Wang, Kyumin Lee

Instruction Tuning (IT) has been proven to be an effective approach to unlock the powerful capabilities of large language models (LLMs). Recent studies indicate that excessive IT data can degrade LLM performance, while carefully selecting…

Computation and Language · Computer Science 2026-03-16 Xin Chen, Junchao Wu, Shu Yang, Runzhe Zhan, Zeyu Wu, Min Yang, Shujian Huang, Lidia S. Chao, Derek F. Wong

Large language models (LLMs) achieve remarkable advancements by leveraging tools to interact with environments, a critical step toward generalized AI. However, the standard supervised fine-tuning (SFT) approach, which relies on large-scale…

Computation and Language · Computer Science 2025-08-27 Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du

We propose a novel task-agnostic in-domain pre-training method that sits between generic pre-training and fine-tuning. Our approach selectively masks in-domain keywords, i.e., words that provide a compact representation of the target…

Computation and Language · Computer Science 2023-07-17 Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour

Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have…

Computation and Language · Computer Science 2024-05-21 Sanchit Sinha, Yuguang Yue, Victor Soto, Mayank Kulkarni, Jianhua Lu, Aidong Zhang

This paper introduces a simple and scalable approach to improve the data efficiency of large language model (LLM) training by augmenting existing text data with thinking trajectories. The compute for pre-training LLMs has been growing at an…

Computation and Language · Computer Science 2025-10-20 Liang Wang, Nan Yang, Shaohan Huang, Li Dong, Furu Wei

Domain adaptation for large neural language models (NLMs) is coupled with massive amounts of unstructured data in the pretraining phase. In this study, however, we show that pretrained NLMs learn in-domain information more effectively and…

Computation and Language · Computer Science 2022-08-30 Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour

Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training. In contrast, traditional adaptation approaches, such as fine-tuning, modify the…

Machine Learning · Computer Science 2023-06-14 Kush Bhatia, Avanika Narayan, Christopher De Sa, Christopher Ré

The static "train then deploy" paradigm fundamentally limits Large Language Models (LLMs) from dynamically adapting their weights in response to continuous streams of new information inherent in real-world tasks. Test-Time Training (TTT)…

Machine Learning · Computer Science 2026-04-08 Guhao Feng, Shengjie Luo, Kai Hua, Ge Zhang, Di He, Wenhao Huang, Tianle Cai

The application of large language models (LLMs) in domain-specific contexts, including finance, has expanded rapidly. Domain-specific LLMs are typically evaluated based on their performance in various downstream tasks relevant to the…

Artificial Intelligence · Computer Science 2024-12-06 Meni Brief, Oded Ovadia, Gil Shenderovitz, Noga Ben Yoash, Rachel Lemberg, Eitam Sheetrit

Specializing LLMs in various domain-specific tasks has emerged as a critical step towards achieving high performance. However, the construction and annotation of datasets in specific domains are always very costly. Apart from using superior…

Computation and Language · Computer Science 2024-12-09 Yuanhao Yue, Chengyu Wang, Jun Huang, Peng Wang

Large language models (LLMs) often exhibit limited performance on domain-specific tasks due to the natural disproportionate representation of specialized information in their training data and the static nature of these datasets. Knowledge…

Computation and Language · Computer Science 2025-09-30 Chaojun Nie, Jun Zhou, Guanxiang Wang, Shisong Wu, Zichen Wang

Large language models (LLMs) have shown great potential in domain-specific machine translation (MT). However, one major issue is that LLMs pre-trained on general domain corpus might not generalize well to specific domains due to the lack of…

Computation and Language · Computer Science 2024-12-18 Jiawei Zheng, Hanghai Hong, Feiyan Liu, Xiaoli Wang, Jingsong Su, Yonggui Liang, Shikai Wu

While Large Language Models (LLMs) have exhibited remarkable emergent capabilities through extensive pre-training, they still face critical limitations in generalizing to specialized domains and handling diverse linguistic variations, known…

Computation and Language · Computer Science 2025-05-28 Jinwu Hu, Zhitian Zhang, Guohao Chen, Xutao Wen, Chao Shuai, Wei Luo, Bin Xiao, Yuanqing Li, Mingkui Tan

In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has…

Computation and Language · Computer Science 2023-06-14 Zhengxiang Shi, Aldo Lipani

Over recent years, an increasing amount of compute and data has been poured into training large language models (LLMs), usually by doing one-pass learning on as many tokens as possible randomly selected from large-scale web corpora. While…

Computation and Language · Computer Science 2023-08-24 Kushal Tirumala, Daniel Simig, Armen Aghajanyan, Ari S. Morcos