Related papers: Parallel-Probe: Towards Efficient Parallel Thinkin…

DeepPrune: Parallel Scaling without Inter-trace Redundancy

Parallel scaling has emerged as a powerful paradigm to enhance reasoning capabilities in large language models (LLMs) by generating multiple Chain-of-Thought (CoT) traces simultaneously. However, this approach introduces significant…

Computation and Language · Computer Science 2026-04-17 Shangqing Tu, Yaxuan Li, Yushi Bai, Lei Hou, Juanzi Li

A Survey on Parallel Reasoning

With the increasing capabilities of Large Language Models (LLMs), parallel reasoning has emerged as a new inference paradigm that enhances reasoning robustness by concurrently exploring multiple lines of thought before converging on a final…

Computation and Language · Computer Science 2025-10-15 Ziqi Wang, Boye Niu, Zipeng Gao, Zhi Zheng, Tong Xu, Linghui Meng, Zhongli Li, Jing Liu, Yilong Chen, Chen Zhu, Hua Wu, Haifeng Wang, Enhong Chen

Dynamic Parallel Tree Search for Efficient LLM Reasoning

Tree of Thoughts (ToT) enhances Large Language Model (LLM) reasoning by structuring problem-solving as a spanning tree. However, recent methods focus on search accuracy while overlooking computational efficiency. The challenges of…

Artificial Intelligence · Computer Science 2025-02-28 Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang, Ziwei Liu, Bo Du, Xianglong Liu, Dacheng Tao

Continuous Chain of Thought Enables Parallel Exploration and Reasoning

Modern language models generate chain-of-thought traces by autoregressively sampling tokens from a finite vocabulary. While this discrete sampling has achieved remarkable success, conducting chain-of-thought with continuously-valued tokens…

Machine Learning · Computer Science 2026-03-06 Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang, Hrayr Harutyunyan, Ankit Singh Rawat, Samet Oymak

Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling

Test-Time Scaling (TTS) enhances the reasoning capabilities of large language models by allocating additional inference compute to explore the solution space. However, existing parallel TTS methods typically keep branches isolated during…

Computation and Language · Computer Science 2026-05-27 Xinglin Wang, Hao Lin, Shaoxiong Feng, Peiwen Yuan, Yiwei Li, Jiayi Shi, Yueqi Zhang, Chuyi Tan, Ji Zhang, Boyuan Pan, Yao Hu, Kan Li

Parallel Latent Reasoning for Sequential Recommendation

Capturing complex user preferences from sparse behavioral sequences remains a fundamental challenge in sequential recommendation. Recent latent reasoning methods have shown promise by extending test-time computation through multi-step…

Information Retrieval · Computer Science 2026-01-07 Jiakai Tang, Xu Chen, Wen Chen, Jian Wu, Yuning Jiang, Bo Zheng

Scaling Reasoning Tokens via RL and Parallel Thinking: Evidence From Competitive Programming

We study how to scale reasoning token budgets for competitive programming through two complementary approaches: training-time reinforcement learning (RL) and test-time parallel thinking. During RL training, we observe an approximately…

Computation and Language · Computer Science 2026-04-03 Qianfan Zhang, Tianyu Guo, Xuandi Ren, Jiale Chen, Ming Ding, Ran Xin, Xia Xiao

W&D: Scaling Parallel Tool Calling for Efficient Deep Research Agents

Deep research agents have emerged as powerful tools for automating complex intellectual tasks through multi-step reasoning and web-based information seeking. While recent efforts have successfully enhanced these agents by scaling depth…

Artificial Intelligence · Computer Science 2026-02-10 Xiaoqiang Lin, Jun Hao Liew, Silvio Savarese, Junnan Li

ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Recent advances in Large Language Models (LLMs) have been driven by test-time compute scaling - a strategy that improves reasoning by generating longer, sequential thought processes. While effective, this approach encounters a significant…

Computation and Language · Computer Science 2025-09-08 Hao Wen, Yifan Su, Feifei Zhang, Yunxin Liu, Yunhao Liu, Ya-Qin Zhang, Yuanchun Li

ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking

Parallel thinking expands exploration breadth, complementing the deep exploration of information-seeking (IS) agents to further enhance problem-solving capability. However, conventional parallel thinking faces two key challenges in this…

Computation and Language · Computer Science 2025-10-29 Baixuan Li, Dingchu Zhang, Jialong Wu, Wenbiao Yin, Zhengwei Tao, Yida Zhao, Liwen Zhang, Haiyang Shen, Runnan Fang, Pengjun Xie, Jingren Zhou, Yong Jiang

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

Parallel reasoning enhances Large Reasoning Models (LRMs) but incurs prohibitive costs due to futile paths caused by early errors. To mitigate this, path pruning at the prefix level is essential, yet existing research remains fragmented…

Computation and Language · Computer Science 2026-04-20 Jiaxi Bi, Tongxu Luo, Wenyu Du, Zhengyang Tang, Benyou Wang

Rethinking Thinking Tokens: LLMs as Improvement Operators

Reasoning training incentivizes LLMs to produce long chains of thought (long CoT), which, among other things, allows them to explore solution strategies with self-checking. This results in higher accuracy, but inflates context length,…

Machine Learning · Computer Science 2025-10-02 Lovish Madaan, Aniket Didolkar, Suchin Gururangan, John Quan, Ruan Silva, Ruslan Salakhutdinov, Manzil Zaheer, Sanjeev Arora, Anirudh Goyal

Efficient Tree-Structured Deep Research with Adaptive Resource Allocation

Deep research agents, which synthesize information across diverse sources, are significantly constrained by the sequential nature of reasoning. This bottleneck results in high latency, poor runtime adaptability, and inefficient resource…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-31 Lunyiu Nie, Nedim Lipka, Ryan A. Rossi, Swarat Chaudhuri

Sampling Parallelism for Fast and Efficient Bayesian Learning

Machine learning models, and deep neural networks in particular, are increasingly deployed in risk-sensitive domains such as healthcare, environmental forecasting, and finance, where reliable quantification of predictive uncertainty is…

Machine Learning · Computer Science 2026-04-07 Asena Karolin Özdemir, Lars H. Heyen, Arvid Weyrauch, Achim Streit, Markus Götz, Charlotte Debus

The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute

We revisit test-time scaling for language model reasoning and ask a fundamental question: at equal token budget and compute, is it better to run multiple independent chains in parallel, or to run fewer chains that iteratively refine through…

Machine Learning · Computer Science 2025-11-05 Aman Sharma, Paras Chopra

1-D and 2-D Parallel Algorithms for All-Pairs Similarity Problem

The all-pairs similarity problem asks for all pairs of vectors in a set whose similarity surpasses a given threshold, and it is a computational kernel in data mining and information retrieval for several tasks. We…

Information Retrieval · Computer Science 2014-02-14 Eray Özkural, Cevdet Aykanat

Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence

Recent advances in reasoning models have demonstrated significant improvements in accuracy by employing detailed and comprehensive reasoning processes. However, generating these lengthy reasoning sequences is computationally expensive and…

Computation and Language · Computer Science 2025-08-27 Yijiong Yu

Parallelizing Query Optimization on Shared-Nothing Architectures

Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query…

Databases · Computer Science 2015-11-06 Immanuel Trummer, Christoph Koch

Parallelisation of a Common Changepoint Detection Method

In recent years, various means of efficiently detecting changepoints in the univariate setting have been proposed, with one popular approach involving minimising a penalised cost function using dynamic programming. In some situations, these…

Methodology · Statistics 2018-10-09 S. O. Tickle, I. A. Eckley, P. Fearnhead, K. Haynes

Think in Parallel, Answer as One: Logit Averaging for Open-Ended Reasoning

Majority voting has proven effective for close-ended question answering by aggregating parallel reasoning traces. However, it is not directly applicable to open-ended reasoning, such as code generation and web-based deep research, where a…

Computation and Language · Computer Science 2025-12-03 Haonan Wang, Chao Du, Kenji Kawaguchi, Tianyu Pang