Related papers: Learning to Decode Collaboratively with Multiple L…

200 papers

The computational complexity of large language model (LLM) inference significantly constrains their deployment efficiency on edge devices. In contrast, small language models offer faster decoding and lower resource consumption but often…

Computation and Language · Computer Science 2025-04-11 Jianshu She, Wenhao Zheng, Zhengzhong Liu, Hongyi Wang, Eric Xing, Huaxiu Yao, Qirong Ho

One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the benefits of scaling compute during…

Computation and Language · Computer Science 2024-11-21 Sean Welleck, Amanda Bertsch, Matthew Finlayson, Hailey Schoelkopf, Alex Xie, Graham Neubig, Ilia Kulikov, Zaid Harchaoui

How can small-scale large language models (LLMs) efficiently utilize the supervision of LLMs to improve their generative quality? This question has been well studied in scenarios where there is no restriction on the number of LLM…

Computation and Language · Computer Science 2024-10-04 Hyunjong Ok, Jegwang Ryu, Jaeho Lee

Using Large Language Models (LLMs) for complex reasoning is often hindered by high computational costs and latency, while resource-efficient Small Language Models (SLMs) typically lack the necessary reasoning capacity. Existing collaborative…

Computation and Language · Computer Science 2026-01-09 Chengsong Huang, Tong Zheng, Langlin Huang, Jinyuan Li, Haolin Liu, Jiaxin Huang

The most common training pipeline for large language models includes pretraining, finetuning and aligning phases, with their respective resulting models, such as the pretrained model and the finetuned model. Finetuned and aligned models…

Computation and Language · Computer Science 2024-02-29 Lifeng Jin, Baolin Peng, Linfeng Song, Haitao Mi, Ye Tian, Dong Yu

Large Language Models (LLMs) often excel in specific domains but fall short in others due to the limitations of their training. Thus, enabling LLMs to solve problems collaboratively by integrating their complementary knowledge promises to…

Computation and Language · Computer Science 2025-03-20 Ziyao Wang, Muneeza Azmat, Ang Li, Raya Horesh, Mikhail Yurochkin

Large Language Models (LLMs) exhibit impressive capabilities across various applications but encounter substantial challenges such as high inference latency, considerable training costs, and the generation of hallucinations. Collaborative…

Computation and Language · Computer Science 2024-10-24 Kaiyan Zhang, Jianyu Wang, Ning Ding, Biqing Qi, Ermo Hua, Xingtai Lv, Bowen Zhou

Large language models (LLMs) offer strong capabilities but raise cost and privacy concerns, whereas small language models (SLMs) facilitate efficient and private local inference yet suffer from limited capacity. To synergize the…

Computation and Language · Computer Science 2026-04-21 Hang Zeng, Xiangyu Liu, Yong Hu, Chaoyue Niu, Jiarui Zhang, Shaojie Tang, Fan Wu, Guihai Chen

To mitigate the high inference latency stemming from autoregressive decoding in Large Language Models (LLMs), Speculative Decoding has emerged as a novel decoding paradigm for LLM inference. In each decoding step, this method first drafts…

Computation and Language · Computer Science 2024-06-05 Heming Xia, Zhe Yang, Qingxiu Dong, Peiyi Wang, Yongqi Li, Tao Ge, Tianyu Liu, Wenjie Li, Zhifang Sui

Large models have achieved remarkable performance across a range of reasoning and understanding tasks. Prior work often utilizes model ensembles or multi-agent systems to collaboratively generate responses, effectively operating in a…

Machine Learning · Computer Science 2025-11-11 Siqi Huang, Sida Huang, Hongyuan Zhang

Multilingual Large Language Models (LLMs) can process many languages, yet how they internally represent this diversity remains unclear. Do they form shared multilingual representations with language-specific decoding, and if so, why does…

Computation and Language · Computer Science 2026-02-10 Abir Harrasse, Florent Draye, Punya Syon Pandey, Zhijing Jin, Bernhard Schölkopf

Large Language Models (LLMs) exhibit impressive performance on a range of NLP tasks, due to the general-purpose linguistic knowledge acquired during pretraining. Existing model interpretability research (Tenney et al., 2019) suggests that a…

Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely…

Computation and Language · Computer Science 2024-08-08 Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni

A flurry of recent work has demonstrated that pre-trained large language models (LLMs) can be effective task planners for a variety of single-robot tasks. The planning performance of LLMs is significantly improved via prompting techniques,…

Robotics · Computer Science 2024-03-25 Yongchao Chen, Jacob Arkin, Yang Zhang, Nicholas Roy, Chuchu Fan

Recent studies show that LLMs possess different skills and specialize in different tasks. In fact, we observe that their performance varies at several levels of granularity. For example, in the code optimization task, code LLMs excel…

Artificial Intelligence · Computer Science 2025-10-24 Yuanzhe Liu, Ryan Deng, Tim Kaler, Xuhao Chen, Charles E. Leiserson, Yao Ma, Jie Chen

Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent…

Computation and Language · Computer Science 2026-05-11 Viacheslav Meshchaninov, Alexander Shabalin, Egor Chimbulatov, Nikita Gushchin, Ilya Koziev, Alexander Korotin, Dmitry Vetrov

Autoregressive decoding in large language models (LLMs) requires $\mathcal{O}(n)$ sequential steps for $n$ tokens, fundamentally limiting inference throughput. Recent diffusion-based LLMs (dLLMs) enable parallel token generation through…

Computation and Language · Computer Science 2025-10-06 Wenrui Bao, Zhiben Chen, Dan Xu, Yuzhang Shang

The use of large language models (LLMs) for automated code generation has emerged as a significant focus within AI research. As these pretrained models continue to evolve, their ability to understand and generate complex code structures has…

Software Engineering · Computer Science 2025-05-06 Nazmus Ashrafi, Salah Bouktif, Mohammed Mediani

Large language models (LLMs) have become ubiquitous in practice and are widely used for generation tasks such as translation, summarization and instruction following. However, their enormous size and reliance on autoregressive decoding…

Machine Learning · Computer Science 2024-07-18 Benjamin Bergner, Andrii Skliar, Amelie Royer, Tijmen Blankevoort, Yuki Asano, Babak Ehteshami Bejnordi

Large Language Model (LLM) collaborative decoding techniques improve output quality by combining the outputs of multiple models at each generation step, but they incur high computational costs. In this paper, we introduce Collaborative…

Computation and Language · Computer Science 2025-05-30 Jiale Fu, Yuchu Jiang, Junkai Chen, Jiaming Fan, Xin Geng, Xu Yang