Related papers: EpiCoDe: Boosting Model Performance Beyond Trainin…

200 papers

Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination -- generating content ungrounded in the realities of training data. Recent work has focused on decoding techniques to improve…

Computation and Language · Computer Science 2024-04-16 Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu

Augmenting Large Language Models (LLMs) with retrieved external knowledge has proven effective for improving the factual accuracy of generated responses. Despite their success, retrieval-augmented LLMs still face the distractibility issue,…

Computation and Language · Computer Science 2025-02-18 Zexuan Qiu, Zijing Ou, Bin Wu, Jingjing Li, Aiwei Liu, Irwin King

Large Language Models (LLMs) require instruction fine-tuning to perform different downstream tasks. However, the instruction fine-tuning phase still demands significant computational resources and labeled data, lacking a paradigm that can…

Computation and Language · Computer Science 2025-03-10 Yiguan Lin, Bin Xu, Yinghao Li, Yang Gao

Large language models (LLMs) are trained on huge amounts of textual data, and concerns have been raised that the limits of such data may soon be reached. A potential solution is to train on synthetic data sampled from LLMs. In this work, we…

Computation and Language · Computer Science 2025-10-10 Jannek Ulm, Kevin Du, Vésteinn Snæbjarnarson

Large Language Models (LLMs) have so far impressed the world, with unprecedented capabilities that emerge in models at large scales. On the vision side, transformer models (i.e., ViT) are following the same trend, achieving the best…

Computer Vision and Pattern Recognition · Computer Science 2023-10-30 Mustafa Shukor, Corentin Dancette, Matthieu Cord

Large language models (LLMs) tend to inadequately integrate input context during text generation, relying excessively on encoded prior knowledge in model parameters, potentially resulting in generated text with factual inconsistencies or…

Computation and Language · Computer Science 2024-05-07 Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science 2024-02-06 Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

In recent years, the use of large language models (LLMs) for text classification has attracted widespread attention. Despite this, the classification accuracy of LLMs has not yet universally surpassed that of smaller models. LLMs can…

Computation and Language · Computer Science 2024-12-11 Min Zeng, Caiquan Liu, Shiqi Zhang, Li Xie, Chen Sang, Xiaoxin Chen

Speculative decoding (SD) accelerates large language model (LLM) reasoning by using a small draft model to generate candidate tokens, which the target LLM either accepts directly or regenerates upon rejection. However, excessive alignment…

Computation and Language · Computer Science 2026-01-01 Tiancheng Su, Meicong Zhang, Guoxiu He

Using responses generated by high-performing large language models (LLMs) for instruction tuning has become a widely adopted approach. However, the existing literature overlooks a property of LLM-generated responses: they conflate world…

Computation and Language · Computer Science 2026-04-16 Tatsuya Ichinose, Youmi Ma, Masanari Oi, Ryuto Koike, Naoaki Okazaki

Large Language Models (LLMs) have demonstrated remarkable capabilities in code editing, substantially enhancing software development productivity. However, the inherent complexity of code editing tasks forces existing approaches to rely on…

Software Engineering · Computer Science 2025-10-01 Peiding Wang, Li Zhang, Fang Liu, Yinghao Zhu, Wang Xu, Lin Shi, Xiaoli Lian, Minxiao Li, Bo Shen, An Fu

Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both…

With the rapid progress of large language models (LLMs), reliably evaluating the capabilities of pre-trained LLMs has become increasingly important. The challenge is that base pre-trained models are optimized for next-token prediction and…

Computation and Language · Computer Science 2026-05-28 Shaobo Wang, Guo Chen, Ziyue Wang, Zhengyang Tang, Qingyang Liu, Xingzhang Ren, Dayiheng Liu, Linfeng Zhang

Given the high computational cost of preference alignment training of large language models (LLMs), exploring efficient methods to reduce the training overhead remains an important and compelling research problem. Motivated by the…

Machine Learning · Computer Science 2025-06-02 Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng

Domain-specific large language models (LLMs), typically developed by fine-tuning a pre-trained general-purpose LLM on specialized datasets, represent a significant advancement in applied AI. A common strategy in LLM fine-tuning is…

Machine Learning · Computer Science 2026-01-08 Jing-Cheng Pang, Liu Sun, Chang Zhou, Xian Tang, Haichuan Ma, Kun Jiang, Jianlong Wang, Kai Zhang, Sijie Wu, Haoran Cai, Chenwei Wu, Xubin Li, Xin Chen

Large Language Models (LLMs) have demonstrated exceptional abilities across a broad range of language-related tasks, including generating solutions to complex reasoning problems. An effective technique to enhance LLM performance is…

Computation and Language · Computer Science 2024-12-25 Shuzhang Cai, Twumasi Mensah-Boateng, Xander Kuksov, Jing Yuan, Shaojie Tang

Large language models are trained on massive scrapes of the web, as required by current scaling laws. Most progress is made for English, given its abundance of high-quality pretraining data. For most other languages, however, such high…

Computation and Language · Computer Science 2025-02-07 Skyler Seto, Maartje ter Hoeve, Richard He Bai, Natalie Schluter, David Grangier

Speculative decoding accelerates large language model (LLM) inference by using a small draft model to generate candidate tokens for a larger target model to verify. The efficacy of this technique hinges on the trade-off between the time…

Computation and Language · Computer Science 2026-03-03 Jiebin Zhang, Zhenghan Yu, Liang Wang, Nan Yang, Eugene J. Yu, Zheng Li, Yifan Song, Dawei Zhu, Xingxing Zhang, Furu Wei, Sujian Li

Test-time scaling has emerged as a powerful paradigm for enhancing the reasoning capabilities of large language models (LLMs) by allocating additional computational resources during inference. However, this paradigm is inherently…

Computation and Language · Computer Science 2025-09-08 Shengyin Sun, Yiming Li, Xing Li, Yingzhao Lian, Weizhe Lin, Hui-Ling Zhen, Zhiyuan Yang, Chen Chen, Xianzhi Yu, Mingxuan Yuan, Chen Ma

Machine unlearning aims to remove specific information, e.g. sensitive or undesirable content, from large language models (LLMs) while preserving overall performance. We propose an inference-time unlearning algorithm that uses contrastive…

Computation and Language · Computer Science 2025-06-17 Vinith M. Suriyakumar, Ayush Sekhari, Ashia Wilson