Related papers: EpiCoDe: Boosting Model Performance Beyond Trainin…

Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models

Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination -- generating content ungrounded in the realities of training data. Recent work has focused on decoding techniques to improve…

Computation and Language · Computer Science · 2024-04-16 · Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu

Entropy-Based Decoding for Retrieval-Augmented Large Language Models

Augmenting Large Language Models (LLMs) with retrieved external knowledge has proven effective for improving the factual accuracy of generated responses. Despite their success, retrieval-augmented LLMs still face the distractibility issue,…

Computation and Language · Computer Science · 2025-02-18 · Zexuan Qiu, Zijing Ou, Bin Wu, Jingjing Li, Aiwei Liu, Irwin King

Extrapolation Merging: Keep Improving With Extrapolation and Merging

Large Language Models (LLMs) require instruction fine-tuning to perform different downstream tasks. However, the instruction fine-tuning phase still demands significant computational resources and labeled data, lacking a paradigm that can…

Computation and Language · Computer Science · 2025-03-10 · Yiguan Lin, Bin Xu, Yinghao Li, Yang Gao

Contrastive Decoding for Synthetic Data Generation in Low-Resource Language Modeling

Large language models (LLMs) are trained on huge amounts of textual data, and concerns have been raised that the limits of such data may soon be reached. A potential solution is to train on synthetic data sampled from LLMs. In this work, we…

Computation and Language · Computer Science · 2025-10-10 · Jannek Ulm, Kevin Du, Vésteinn Snæbjarnarson

eP-ALM: Efficient Perceptual Augmentation of Language Models

Large Language Models (LLMs) have so far impressed the world, with unprecedented capabilities that emerge in models at large scales. On the vision side, transformer models (i.e., ViT) are following the same trend, achieving the best…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-30 · Mustafa Shukor, Corentin Dancette, Matthieu Cord

Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding

Large language models (LLMs) tend to inadequately integrate input context during text generation, relying excessively on encoded prior knowledge in model parameters, potentially resulting in generated text with factual inconsistencies or…

Computation and Language · Computer Science · 2024-05-07 · Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem

Code Representation Learning At Scale

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science · 2024-02-06 · Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy

In recent years, the use of large language models (LLMs) for text classification has attracted widespread attention. Despite this, the classification accuracy of LLMs has not yet universally surpassed that of smaller models. LLMs can…

Computation and Language · Computer Science · 2024-12-11 · Min Zeng, Caiquan Liu, Shiqi Zhang, Li Xie, Chen Sang, Xiaoxin Chen

Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning

Speculative decoding (SD) accelerates large language model (LLM) reasoning by using a small draft model to generate candidate tokens, which the target LLM either accepts directly or regenerates upon rejection. However, excessive alignment…

Computation and Language · Computer Science · 2026-01-01 · Tiancheng Su, Meicong Zhang, Guoxiu He

Synthesizing Instruction-Tuning Datasets with Contrastive Decoding

Using responses generated by high-performing large language models (LLMs) for instruction tuning has become a widely adopted approach. However, the existing literature overlooks a property of LLM-generated responses: they conflate world…

Computation and Language · Computer Science · 2026-04-16 · Tatsuya Ichinose, Youmi Ma, Masanari Oi, Ryuto Koike, Naoaki Okazaki

EfficientEdit: Accelerating Code Editing via Edit-Oriented Speculative Decoding

Large Language Models (LLMs) have demonstrated remarkable capabilities in code editing, substantially enhancing software development productivity. However, the inherent complexity of code editing tasks forces existing approaches to rely on…

Software Engineering · Computer Science · 2025-10-01 · Peiding Wang, Li Zhang, Fang Liu, Yinghao Zhu, Wang Xu, Lin Shi, Xiaoli Lian, Minxiao Li, Bo Shen, An Fu

ContraCLM: Contrastive Learning For Causal Language Model

Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both…

Computation and Language · Computer Science · 2023-05-04 · Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

The Missing Piece in Pre-trained Model Evaluation: Reward-Guided Decoding Unlocks Task-Oriented Behavior Without Parameter Updates

With the rapid progress of large language models (LLMs), reliably evaluating the capabilities of pre-trained LLMs has become increasingly important. The challenge is that base pre-trained models are optimized for next-token prediction and…

Computation and Language · Computer Science · 2026-05-28 · Shaobo Wang, Guo Chen, Ziyue Wang, Zhengyang Tang, Qingyang Liu, Xingzhang Ren, Dayiheng Liu, Linfeng Zhang

Model Extrapolation Expedites Alignment

Given the high computational cost of preference alignment training of large language models (LLMs), exploring efficient methods to reduce the training overhead remains an important and compelling research problem. Motivated by the…

Machine Learning · Computer Science · 2025-06-02 · Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng

EDCO: Dynamic Curriculum Orchestration for Domain-specific Large Language Model Fine-tuning

Domain-specific large language models (LLMs), typically developed by fine-tuning a pre-trained general-purpose LLM on specialized datasets, represent a significant advancement in applied AI. A common strategy in LLM fine-tuning is…

Machine Learning · Computer Science · 2026-01-08 · Jing-Cheng Pang, Liu Sun, Chang Zhou, Xian Tang, Haichuan Ma, Kun Jiang, Jianlong Wang, Kai Zhang, Sijie Wu, Haoran Cai, Chenwei Wu, Xubin Li, Xin Chen

The Power of Adaptation: Boosting In-Context Learning through Adaptive Prompting

Large Language Models (LLMs) have demonstrated exceptional abilities across a broad range of language-related tasks, including generating solutions to complex reasoning problems. An effective technique to enhance LLM performance is…

Computation and Language · Computer Science · 2024-12-25 · Shuzhang Cai, Twumasi Mensah-Boateng, Xander Kuksov, Jing Yuan, Shaojie Tang

Training Bilingual LMs with Data Constraints in the Targeted Language

Large language models are trained on massive scrapes of the web, as required by current scaling laws. Most progress is made for English, given its abundance of high-quality pretraining data. For most other languages, however, such high…

Computation and Language · Computer Science · 2025-02-07 · Skyler Seto, Maartje ter Hoeve, Richard He Bai, Natalie Schluter, David Grangier

Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning

Speculative decoding accelerates large language model (LLM) inference by using a small draft model to generate candidate tokens for a larger target model to verify. The efficacy of this technique hinges on the trade-off between the time…

Computation and Language · Computer Science · 2026-03-03 · Jiebin Zhang, Zhenghan Yu, Liang Wang, Nan Yang, Eugene J. Yu, Zheng Li, Yifan Song, Dawei Zhu, Xingxing Zhang, Furu Wei, Sujian Li

Scaling Up, Speeding Up: A Benchmark of Speculative Decoding for Efficient LLM Test-Time Scaling

Test-time scaling has emerged as a powerful paradigm for enhancing the reasoning capabilities of large language models (LLMs) by allocating additional computational resources during inference. However, this paradigm is inherently…

Computation and Language · Computer Science · 2025-09-08 · Shengyin Sun, Yiming Li, Xing Li, Yingzhao Lian, Weizhe Lin, Hui-Ling Zhen, Zhiyuan Yang, Chen Chen, Xianzhi Yu, Mingxuan Yuan, Chen Ma

UCD: Unlearning in LLMs via Contrastive Decoding

Machine unlearning aims to remove specific information, e.g. sensitive or undesirable content, from large language models (LLMs) while preserving overall performance. We propose an inference-time unlearning algorithm that uses contrastive…

Computation and Language · Computer Science · 2025-06-17 · Vinith M. Suriyakumar, Ayush Sekhari, Ashia Wilson