Related papers: Approximately Aligned Decoding

Grammar-Aligned Decoding

Large Language Models (LLMs) struggle with reliably generating highly structured outputs, such as program code, mathematical formulas, or well-formed markup. Constrained decoding approaches mitigate this problem by greedily restricting what…

Artificial Intelligence · Computer Science 2025-12-15 Kanghee Park, Jiayu Wang, Taylor Berg-Kirkpatrick, Nadia Polikarpova, Loris D'Antoni

Adaptive Draft-Verification for Efficient Large Language Model Decoding

Large language model (LLM) decoding involves generating a sequence of tokens based on a given context, where each token is predicted one at a time using the model's learned probabilities. The typical autoregressive decoding method requires…

Computation and Language · Computer Science 2024-08-20 Xukun Liu, Bowen Lei, Ruqi Zhang, Dongkuan Xu

Accelerating Diffusion LLMs via Adaptive Parallel Decoding

The generation speed of LLMs is bottlenecked by autoregressive decoding, where tokens are predicted sequentially one by one. Alternatively, diffusion large language models (dLLMs) theoretically allow for parallel token generation, but in…

Computation and Language · Computer Science 2025-11-03 Daniel Israel, Guy Van den Broeck, Aditya Grover

APCD: Adaptive Path-Contrastive Decoding for Reliable Large Language Model Generation

Large language models (LLMs) often suffer from hallucinations due to error accumulation in autoregressive decoding, where suboptimal early token choices misguide subsequent generation. Although multi-path decoding can improve robustness by…

Computation and Language · Computer Science 2026-05-21 Tianyu Zheng, Hong Wu, Jiaji Zhong

Safety Alignment of Large Language Models via Contrasting Safe and Harmful Distributions

With the widespread application of Large Language Models (LLMs), it has become a significant concern to ensure their safety and prevent harmful responses. While current safe-alignment methods based on instruction fine-tuning and…

Computation and Language · Computer Science 2025-12-16 Xiaoyun Zhang, Zhengyue Zhao, Wenxuan Shi, Kaidi Xu, Di Huang, Xing Hu

Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

Large language models (LLMs) have progressed rapidly in complex reasoning and question answering, yet LLM hallucination remains a central bottleneck that hinders practical deployment, especially for commercial black-box LLMs accessible only…

Computation and Language · Computer Science 2026-05-08 Huizi Cui, Huan Ma, Qilin Wang, Yuhang Gao, Changqing Zhang

Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning

Aligning Large Language Models (LLMs) is crucial for enhancing their safety and utility. However, existing methods, primarily based on preference datasets, face challenges such as noisy labels, high annotation costs, and privacy concerns.…

Machine Learning · Computer Science 2025-01-28 Hao Sun, Mihaela van der Schaar

AdaSD: Adaptive Speculative Decoding for Efficient Language Model Inference

Large language models (LLMs) have achieved remarkable performance across a wide range of tasks, but their increasing parameter sizes significantly slow down inference. Speculative decoding mitigates this issue by leveraging a smaller draft…

Computation and Language · Computer Science 2026-05-27 Kuan-Wei Lu, Ding-Yong Hong, Pangfeng Liu, Jan-Jan Wu

Attribution-Guided Decoding

The capacity of Large Language Models (LLMs) to follow complex instructions and generate factually accurate text is critical for their real-world application. However, standard decoding methods often fail to robustly satisfy these…

Machine Learning · Computer Science 2026-03-18 Piotr Komorowski, Elena Golimblevskaia, Reduan Achtibat, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism

Large language models (LLMs) are increasingly used for long-content generation (e.g., long Chain-of-Thought reasoning) where decoding efficiency becomes a critical bottleneck: Autoregressive decoding is inherently limited by its sequential…

Computation and Language · Computer Science 2025-06-05 Zhepei Wei, Wei-Lin Chen, Xinyu Zhu, Yu Meng

Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models -

Despite recent advances in Large Vision Language Models (LVLMs), these models still suffer from generating hallucinatory responses that do not align with the visual input provided. To mitigate such hallucinations, we introduce Efficient…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 Laura Fieback, Nishilkumar Balar, Jakob Spiegelberg, Hanno Gottschalk

Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference

Large language models (LLMs) have achieved remarkable success across diverse tasks, yet their inference processes are hindered by substantial time and energy demands due to single-token generation at each decoding step. While previous…

Computation and Language · Computer Science 2025-04-11 Zongyue Qin, Ziniu Hu, Zifan He, Neha Prakriya, Jason Cong, Yizhou Sun

Speculative Contrastive Decoding

Large language models (LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited due to high computational requirements and is sub-optimal due to the exposure bias. Inspired by speculative…

Computation and Language · Computer Science 2024-03-14 Hongyi Yuan, Keming Lu, Fei Huang, Zheng Yuan, Chang Zhou

Alignment Imprint: Zero-Shot AI-Generated Text Detection via Provable Preference Discrepancy

Detecting AI-generated text is an important but challenging problem. Existing likelihood-based detection methods are often sensitive to content complexity and may exhibit unstable performance. In this paper, our key insight is that modern…

Artificial Intelligence · Computer Science 2026-04-21 Junxi Wu, Kailin Huang, Dongjian Hu, Bin Chen, Hao Wu, Shu-Tao Xia, Changliang Zou

AdapTrack: Constrained Decoding without Distorting LLM's Output Intent

Language model-based code generation and completion tools have been widely adopted, but they may sometimes produce code that does not meet necessary constraints, such as syntactic correctness or API existence. Constrained decoding…

Software Engineering · Computer Science 2025-10-21 Yongmin Li, Jia Li, Ge Li, Zhi Jin

Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM

Contrastive decoding (CD) (Li et al., 2023) improves the next-token distribution of a large expert language model (LM) using a small amateur LM. Although CD is applied to various LMs and domains to enhance open-ended text generation, it is…

Computation and Language · Computer Science 2024-11-05 Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, Tagyoung Chung

Alignment-Aware Decoding

Alignment of large language models remains a central challenge in natural language processing. Preference optimization has emerged as a popular and effective method for improving alignment, typically through training-time or prompt-based…

Machine Learning · Computer Science 2025-10-01 Frédéric Berdoz, Luca A. Lanzendörfer, René Caky, Roger Wattenhofer

Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning

Speculative decoding accelerates large language model (LLM) inference by using a small draft model to generate candidate tokens for a larger target model to verify. The efficacy of this technique hinges on the trade-off between the time…

Computation and Language · Computer Science 2026-03-03 Jiebin Zhang, Zhenghan Yu, Liang Wang, Nan Yang, Eugene J. Yu, Zheng Li, Yifan Song, Dawei Zhu, Xingxing Zhang, Furu Wei, Sujian Li

ELAD: Explanation-Guided Large Language Models Active Distillation

The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs…

Computation and Language · Computer Science 2024-11-21 Yifei Zhang, Bo Pan, Chen Ling, Yuntong Hu, Liang Zhao

Alignment Adapter to Improve the Performance of Compressed Deep Learning Models

Compressed Deep Learning (DL) models are essential for deployment in resource-constrained environments. But their performance often lags behind their large-scale counterparts. To bridge this gap, we propose Alignment Adapter (AlAd): a…

Machine Learning · Computer Science 2026-02-17 Rohit Raj Rai, Abhishek Dhaka, Amit Awekar