Related papers: Constrained Adaptive Rejection Sampling

Adaptive Rejection Sampling with fixed number of nodes

The adaptive rejection sampling (ARS) algorithm is a universal random generator for drawing samples efficiently from a univariate log-concave target probability density function (pdf). ARS generates independent samples from the target via…

Computation · Statistics 2017-10-10 L. Martino, F. Louzada

Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling

The dominant approach to generating from language models subject to some constraint is locally constrained decoding (LCD), incrementally sampling tokens at each time step such that the constraint is never violated. Typically, this is…

Computation and Language · Computer Science 2025-08-19 Benjamin Lipkin, Benjamin LeBrun, Jacob Hoover Vigly, João Loula, David R. MacIver, Li Du, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Timothy J. O'Donnell, Alexander K. Lew, Tim Vieira

Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models

Speculative Decoding is a prominent technique for accelerating the autoregressive inference of large language models (LLMs) by employing a fast draft model to propose candidate token sequences and a large target model to verify them in…

Computation and Language · Computer Science 2025-12-18 Chendong Sun, Ali Mao, Lei Xu, mingmin Chen

Constrained Sampling for Language Models Should Be Easy: An MCMC Perspective

Constrained decoding enables Language Models (LMs) to produce samples that provably satisfy hard constraints. However, existing constrained-decoding approaches often distort the underlying model distribution, a limitation that is especially…

Artificial Intelligence · Computer Science 2025-06-09 Emmanuel Anaya Gonzalez, Sairam Vaidya, Kanghee Park, Ruyi Ji, Taylor Berg-Kirkpatrick, Loris D'Antoni

Cascade Reward Sampling for Efficient Decoding-Time Alignment

Aligning large language models (LLMs) with human preferences is essential for their applications. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that avoids fine-tuning model parameters. This approach…

Computation and Language · Computer Science 2025-08-05 Bolian Li, Yifan Wang, Anamika Lochab, Ananth Grama, Ruqi Zhang

Improved Constrained Generation by Bridging Pretrained Generative Models

Constrained generative modeling is fundamental to applications such as robotic control and autonomous driving, where models must respect physical laws and safety-critical constraints. In real-world settings, these constraints rarely take…

Machine Learning · Computer Science 2026-03-10 Xiaoxuan Liang, Saeid Naderiparizi, Yunpeng Liu, Berend Zwartsenberg, Frank Wood

Cluster-based Adaptive Retrieval: Dynamic Context Selection for RAG Applications

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by pulling in external material (documents, code, manuals) from vast and ever-growing corpora to effectively answer user queries. The effectiveness of RAG depends…

Information Retrieval · Computer Science 2025-11-20 Yifan Xu, Vipul Gupta, Rohit Aggarwal, Varsha Mahadevan, Bhaskar Krishnamachari

Pliable rejection sampling

Rejection sampling is a technique for sampling from difficult distributions. However, its use is limited due to a high rejection rate. Common adaptive rejection sampling methods either work only for very specific distributions or without…

Machine Learning · Statistics 2026-04-27 Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric-Ambrym Maillard

Regenerative Rejection Sampling

This thesis presents Regenerative Rejection Sampling (RRS), a novel approximate sampling algorithm inspired by classical Rejection Sampling and Markov Chain Monte Carlo methods. The method constructs a continuous-time regenerative process…

Computation · Statistics 2026-04-01 Tommaso Bozzi

Two adaptive rejection sampling schemes for probability density functions log-convex tails

Monte Carlo methods are often necessary for the implementation of optimal Bayesian estimators. A fundamental technique that can be used to generate samples from virtually any target probability distribution is the so-called rejection…

Computation · Statistics 2011-11-22 Luca Martino, Joaquín Míguez

Resampled Priors for Variational Autoencoders

We propose Learned Accept/Reject Sampling (LARS), a method for constructing richer priors using rejection sampling with a learned acceptance function. This work is motivated by recent analyses of the VAE objective, which pointed out that…

Machine Learning · Statistics 2019-04-29 Matthias Bauer, Andriy Mnih

RADS: Reinforcement Learning-Based Sample Selection Improves Transfer Learning in Low-resource and Imbalanced Clinical Settings

A common strategy in transfer learning is few shot fine-tuning, but its success is highly dependent on the quality of samples selected as training examples. Active learning methods such as uncertainty sampling and diversity sampling can…

Computation and Language · Computer Science 2026-04-23 Wei Han, David Martinez, Anna Khanina, Lawrence Cavedon, Karin Verspoor

Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation

Despite significant progress, state-of-the-art abstractive summarization methods are still prone to hallucinate content inconsistent with the source document. In this paper, we propose Constrained Abstractive Summarization (CAS), a general…

Computation and Language · Computer Science 2021-12-17 Yuning Mao, Xiang Ren, Heng Ji, Jiawei Han

Grammar-Aligned Decoding

Large Language Models (LLMs) struggle with reliably generating highly structured outputs, such as program code, mathematical formulas, or well-formed markup. Constrained decoding approaches mitigate this problem by greedily restricting what…

Artificial Intelligence · Computer Science 2025-12-15 Kanghee Park, Jiayu Wang, Taylor Berg-Kirkpatrick, Nadia Polikarpova, Loris D'Antoni

Parsimonious Adaptive Rejection Sampling

Monte Carlo (MC) methods have become very popular in signal processing during the past decades. The adaptive rejection sampling (ARS) algorithms are well-known MC techniques which efficiently draw independent samples from univariate target…

Computation · Statistics 2017-10-16 Luca Martino

Cactus: Accelerating Auto-Regressive Decoding with Constrained Acceptance Speculative Sampling

Speculative sampling (SpS) has been successful in accelerating the decoding throughput of auto-regressive large language models by leveraging smaller draft models. SpS strictly enforces the generated distribution to match that of the…

Machine Learning · Computer Science 2026-04-08 Yongchang Hao, Lili Mou

Lossless Anti-Distillation Sampling

Frontier commercial generative models face a growing threat from distillation, whereby a distiller harvests generated responses and trains a competing model of its own at drastically lower cost. Existing defenses either rely on modifying…

Machine Learning · Computer Science 2026-05-20 Zibo Diao, Jingchu Gai, Xinyue Ai, Zhang Zhang, Zhenyu He, Di He

Prolonged Reasoning Is Not All You Need: Certainty-Based Adaptive Routing for Efficient LLM/MLLM Reasoning

Recent advancements in reasoning have significantly enhanced the capabilities of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) across diverse tasks. However, excessive reliance on chain-of-thought (CoT) reasoning…

Computation and Language · Computer Science 2025-05-22 Jinghui Lu, Haiyang Yu, Siliang Xu, Shiwei Ran, Guozhi Tang, Siqi Wang, Bin Shan, Teng Fu, Hao Feng, Jingqun Tang, Han Wang, Can Huang

SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression

Retrieval-augmented Generation (RAG) extends large language models (LLMs) with external knowledge but faces key challenges: restricted effective context length and redundancy in retrieved documents. Pure compression-based approaches reduce…

Computation and Language · Computer Science 2025-07-09 Yiqiao Jin, Kartik Sharma, Vineeth Rakesh, Yingtong Dou, Menghai Pan, Mahashweta Das, Srijan Kumar

Context-Adaptive Synthesis and Compression for Enhanced Retrieval-Augmented Generation in Complex Domains

Large Language Models (LLMs) excel in language tasks but are prone to hallucinations and outdated knowledge. Retrieval-Augmented Generation (RAG) mitigates these by grounding LLMs in external knowledge. However, in complex domains involving…

Computation and Language · Computer Science 2025-08-28 Peiran Zhou, Junnan Zhu, Yichen Shen, Ruoxi Yu