Computation and Language · Computer Science
Improving Multi-candidate Speculative Decoding
Xiaofan Lu, Yixiao Zeng, Feiyang Ma, Zixu Yu +1
2024-12-17
Computer Vision and Pattern Recognition · Computer Science
Speculative Decoding Reimagined for Multimodal Large Language Models
Luxi Lin, Zhihang Lin, Zhanpeng Zeng, Rongrong Ji
2025-05-21
Computation and Language · Computer Science
S2D: Sorted Speculative Decoding For More Efficient Deployment of Nested Large Language Models
Parsa Kavehzadeh, Mohammadreza Pourreza, Mojtaba Valipour, Tinashu Zhu +4
2024-07-03
Computation and Language · Computer Science
Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies
Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky +4
2025-06-12
Machine Learning · Computer Science
Faster LLM Inference via Sequential Monte Carlo
Yahya Emara, Mauricio Barba da Costa, Chi-Chih Chang, Cameron Freer +3
2026-04-20
Computation and Language · Computer Science
3-Model Speculative Decoding
Sanghyun Byun, Mohanad Odema, Jung Ick Guack, Baisub Lee +2
2025-10-16
Computation and Language · Computer Science
Beyond the Target: From Imitation to Collaboration in Speculative Decoding
Jinze Li, Yixing Xu, Guanchen Li, Jinfeng Xu +6
2026-05-26
Computation and Language · Computer Science
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters
Euiin Yi, Taehyeon Kim, Hongseok Jeung, Du-Seong Chang +1
2024-11-12
Distributed, Parallel, and Cluster Computing · Computer Science
SpecBranch: Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism
Yuhao Shen, Junyi Shen, Quan Kong, Tianyu Liu +2
2026-04-15
Computation and Language · Computer Science
Speculative Verification: Exploiting Information Gain to Refine Speculative Decoding
Sungkyun Kim, Jaemin Kim, Dogyung Yoon, Jiho Shin +2
2026-04-21
Artificial Intelligence · Computer Science
Dynamic-Width Speculative Beam Decoding for Efficient LLM Inference
Zongyue Qin, Zifan He, Neha Prakriya, Jason Cong +1
2025-03-17
Computation and Language · Computer Science
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang +2
2024-10-10
Computation and Language · Computer Science
Tutorial Proposal: Speculative Decoding for Efficient LLM Inference
Heming Xia, Cunxiao Du, Yongqi Li, Qian Liu +1
2025-03-04
Machine Learning · Computer Science
Fast Inference via Hierarchical Speculative Decoding
Clara Mohri, Haim Kaplan, Tal Schuster, Yishay Mansour +1
2025-10-24
Computation and Language · Computer Science
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
Heming Xia, Zhe Yang, Qingxiu Dong, Peiyi Wang +5
2024-06-05
Computation and Language · Computer Science
AdaSD: Adaptive Speculative Decoding for Efficient Language Model Inference
Kuan-Wei Lu, Ding-Yong Hong, Pangfeng Liu, Jan-Jan Wu
2026-05-27
Computation and Language · Computer Science
Speculative Decoding: Performance or Illusion?
Xiaoxuan Liu, Jiaxiang Yu, Jongseok Park, Ion Stoica +1
2026-03-19
Computation and Language · Computer Science
SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding
Ryan Sun, Tianyi Zhou, Xun Chen, Lichao Sun
2024-11-11
Computation and Language · Computer Science
Scaling LLM Speculative Decoding: Non-Autoregressive Forecasting in Large-Batch Scenarios
Luohe Shi, Zuchao Li, Lefei Zhang, Baoyuan Qi +2
2025-11-26
Computation and Language · Computer Science
Automatic Task Detection and Heterogeneous LLM Speculative Decoding
Danying Ge, Jianhua Gao, Qizhi Jiang, Yifei Feng +1
2025-05-14
Computation and Language · Computer Science
Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning
Jiebin Zhang, Zhenghan Yu, Liang Wang, Nan Yang +7
2026-03-03
Computation and Language · Computer Science
MineDraft: A Framework for Batch Parallel Speculative Decoding
Zhenwei Tang, Arun Verma, Zijian Zhou, Zhaoxuan Wu +3
2026-03-20