Related papers: Programming over Thinking: Efficient and Robust Mu…

Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning

Despite the remarkable success of large language models (LLMs) on traditional natural language processing tasks, their planning ability remains a critical bottleneck in tackling complex multi-step reasoning tasks. Existing approaches mainly…

Computation and Language · Computer Science · 2024-10-07 · Jiaxin Wen, Jian Guan, Hongning Wang, Wei Wu, Minlie Huang

Efficient LLM Collaboration via Planning

Recently, large language models (LLMs) have demonstrated strong performance, ranging from simple to complex tasks. However, while large models achieve remarkable results across diverse tasks, they often incur substantial monetary inference…

Artificial Intelligence · Computer Science · 2026-05-12 · Byeongchan Lee, Jonghoon Lee, Dongyoung Kim, Jaehyung Kim, Kyungjoon Park, Dongjun Lee, Jinwoo Shin

Planning In Natural Language Improves LLM Search For Code Generation

While scaling training compute has led to remarkable improvements in large language models (LLMs), scaling inference compute has not yet yielded analogous gains. We hypothesize that a core missing component is a lack of diverse LLM outputs,…

Machine Learning · Computer Science · 2024-10-22 · Evan Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song, Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, Hugh Zhang

Constraint-aware Path Planning from Natural Language Instructions Using Large Language Models

Real-world path planning tasks typically involve multiple constraints beyond simple route optimization, such as the number of routes, maximum route length, depot locations, and task-specific requirements. Traditional approaches rely on…

Computation and Language · Computer Science · 2026-03-23 · Dylan Shim, Minghan Wei

SCOPE: Language Models as One-Time Teacher for Hierarchical Planning in Text Environments

Long-term planning in complex, text-based environments presents significant challenges due to open-ended action spaces, ambiguous observations, and sparse feedback. Recent research suggests that large language models (LLMs) encode rich…

Artificial Intelligence · Computer Science · 2025-12-11 · Haoye Lu, Pavan Seshadri, Kaheer Suleman

Interactive and Expressive Code-Augmented Planning with Large Language Models

Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making, but often struggle with complex, long-horizon planning tasks. Recent techniques have sought to structure LLM outputs using…

Computation and Language · Computer Science · 2024-11-22 · Anthony Z. Liu, Xinhe Wang, Jacob Sansom, Yao Fu, Jongwook Choi, Sungryull Sohn, Jaekyeom Kim, Honglak Lee

LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning

Although planning is a crucial component of the autonomous driving stack, researchers have yet to develop robust planning algorithms that are capable of safely handling the diverse range of possible driving scenarios. Learning-based…

Artificial Intelligence · Computer Science · 2024-01-02 · S P Sharan, Francesco Pittaluga, Vijay Kumar B G, Manmohan Chandraker

Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning

There emerges a promising trend of using large language models (LLMs) to generate code-like plans for complex inference tasks such as visual reasoning. This paradigm, known as LLM-based planning, provides flexibility in problem solving and…

Computation and Language · Computer Science · 2023-08-22 · Pengbo Hu, Ji Qi, Xingyu Li, Hong Li, Xinqi Wang, Bing Quan, Ruiyu Wang, Yi Zhou

Self-planning Code Generation with Large Language Models

Although large language models (LLMs) have demonstrated impressive ability in code generation, they are still struggling to address the complicated intent provided by humans. It is widely acknowledged that humans typically employ planning…

Software Engineering · Computer Science · 2025-10-21 · Xue Jiang, Yihong Dong, Lecheng Wang, Zheng Fang, Qiwei Shang, Ge Li, Zhi Jin, Wenpin Jiao

CRISP: Complex Reasoning with Interpretable Step-based Plans

Recent advancements in large language models (LLMs) underscore the need for stronger reasoning capabilities to solve complex problems effectively. While Chain-of-Thought (CoT) reasoning has been a step forward, it remains insufficient for…

Computation and Language · Computer Science · 2025-07-14 · Matan Vetzler, Koren Lazar, Guy Uziel, Eran Hirsch, Ateret Anaby-Tavor, Leshem Choshen

SCALE: Selective Resource Allocation for Overcoming Performance Bottlenecks in Mathematical Test-time Scaling

Test-time compute scaling has emerged as a powerful paradigm for enhancing mathematical reasoning in large language models (LLMs) by allocating additional computational resources during inference. However, current methods employ uniform…

Computation and Language · Computer Science · 2025-12-02 · Yang Xiao, Chunpu Xu, Ruifeng Yuan, Jiashuo Wang, Wenjie Li, Pengfei Liu

Not All Code Is Equal: A Data-Centric Study of Code Complexity and LLM Reasoning

Large Language Models (LLMs) increasingly exhibit strong reasoning abilities, often attributed to their capacity to generate chain-of-thought-style intermediate reasoning. Recent work suggests that exposure to code can further enhance these…

Machine Learning · Computer Science · 2026-01-30 · Lukas Twist, Shu Yang, Hanqi Yan, Jingzhi Gong, Di Wang, Helen Yannakoudakis, Jie M. Zhang

Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization

Chain-of-Thought (CoT) empowers Large Language Models (LLMs) to tackle complex problems, but remains constrained by the computational cost and reasoning path collapse when grounded in discrete token spaces. Recent latent reasoning…

Artificial Intelligence · Computer Science · 2026-02-05 · Jiecong Wang, Hao Peng, Chunyang Liu

Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation

The multi-step reasoning ability of large language models is crucial in tasks such as math and tool utilization. Current research predominantly focuses on enhancing model performance in these multi-step reasoning tasks through fine-tuning with…

Computation and Language · Computer Science · 2024-10-23 · Yuli Qiu, Jiashu Yao, Heyan Huang, Yuhang Guo

A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models

Recent studies have highlighted their proficiency in some simple tasks like writing and coding through various reasoning strategies. However, LLM agents still struggle with tasks that require comprehensive planning, a process that…

Artificial Intelligence · Computer Science · 2024-05-29 · Chengxing Xie, Difan Zou

Mil-SCORE: Benchmarking Long-Context Geospatial Reasoning and Planning in Large Language Models

As large language models (LLMs) are applied to increasingly longer and more complex tasks, there is a growing need for realistic long-context benchmarks that require selective reading and integration of heterogeneous, multi-modal…

Computation and Language · Computer Science · 2026-02-06 · Aadi Palnitkar, Mingyang Mao, Nicholas Waytowich, Vinicius G. Goecks, Xiaomin Lin

REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving

While model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models continues to be a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven…

Machine Learning · Computer Science · 2026-02-05 · Annabelle Sujun Tang, Christopher Priebe, Rohan Mahapatra, Lianhui Qin, Hadi Esmaeilzadeh

Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs

Large Language Models (LLMs) have recently made significant advances in code generation through the 'Chain-of-Thought' prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate…

Software Engineering · Computer Science · 2024-03-21 · Zhihong Sun, Chen Lyu, Bolun Li, Yao Wan, Hongyu Zhang, Ge Li, Zhi Jin

CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation

Large Language Models (LLMs) have demonstrated remarkable performance on assisting humans in programming and facilitating programming automation. However, existing benchmarks for evaluating the code understanding and generation capacities…

Computation and Language · Computer Science · 2024-06-10 · Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, Wen Wang, Tingyu Lin, Weishan Zhao, Li Zhu, Hari Sundaram, Shuiguang Deng

Dual-Track CoT: Budget-Aware Stepwise Guidance for Small LMs

Large Language Models (LLMs) solve many reasoning tasks via chain-of-thought (CoT) prompting, but smaller models (about 7 to 8B parameters) still struggle with multi-step reasoning under tight compute and token budgets. Existing test time…

Computation and Language · Computer Science · 2026-04-29 · Sagnik Chatterjee, Atharva Patil, Sricharan Ramesh