English
Related papers

Related papers: Code Simulation Challenges for Large Language Mode…

200 papers

Many reasoning, planning, and problem-solving tasks share an intrinsic algorithmic nature: correctly simulating each step is a sufficient condition to solve them correctly. We collect pairs of naturalistic and synthetic reasoning tasks to…

Code provides a general syntactic structure to build complex programs and perform precise computations when paired with a code interpreter - we hypothesize that language models (LMs) can leverage code-writing to improve Chain of Thought…

Computation and Language · Computer Science 2024-07-31 Chengshu Li , Jacky Liang , Andy Zeng , Xinyun Chen , Karol Hausman , Dorsa Sadigh , Sergey Levine , Li Fei-Fei , Fei Xia , Brian Ichter

Transformer-based large language models (LLMs) have demonstrated significant potential in addressing logic problems. capitalizing on the great capabilities of LLMs for code-related activities, several frameworks leveraging logical solvers…

Artificial Intelligence · Computer Science 2024-03-29 Minyu Chen , Guoqiang Li , Ling-I Wu , Ruibang Liu , Yuxin Su , Xi Chang , Jianxin Xue

Large Language Models (LLMs) often struggle with complex reasoning tasks due to insufficient in-depth insights in their training data, which are typically absent in publicly available documents. This paper introduces the Chain of…

Computation and Language · Computer Science 2025-06-10 Cong Liu , Jie Wu , Weigang Wu , Xu Chen , Liang Lin , Wei-Shi Zheng

Large Language Models (LLMs) have achieved remarkable success in tasks requiring complex reasoning, such as code generation, mathematical problem solving, and algorithmic synthesis -- especially when aided by reasoning tokens and…

Computation and Language · Computer Science 2025-06-13 Jaechul Roh , Varun Gandhi , Shivani Anilkumar , Arin Garg

Training large language models (LLMs) with chain-of-thought (CoT) supervision has proven effective for enhancing their reasoning abilities. However, obtaining reliable and accurate reasoning supervision remains a significant challenge. We…

Computation and Language · Computer Science 2025-10-21 Dongwon Jung , Wenxuan Zhou , Muhao Chen

Large Language Models (LLMs) have demonstrated strong capabilities in natural language understanding and reasoning. However, their ability to perform exact, deterministic computation remains unclear. In this work, we systematically evaluate…

Artificial Intelligence · Computer Science 2026-05-08 Hongkun Yu

Large Language Models (LLMs) have been widely used to automate programming tasks. Their capabilities have been evaluated by assessing the quality of generated code through tests or proofs. The extent to which they can reason about code is a…

Software Engineering · Computer Science 2026-04-08 Changshu Liu , Yang Chen , Reyhaneh Jabbarvand

Recent studies have discovered that Chain-of-Thought prompting (CoT) can dramatically improve the performance of Large Language Models (LLMs), particularly when dealing with complex tasks involving mathematics or reasoning. Despite the…

Machine Learning · Computer Science 2023-12-27 Guhao Feng , Bohang Zhang , Yuntian Gu , Haotian Ye , Di He , Liwei Wang

Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into…

Software Engineering · Computer Science 2024-03-07 Chongzhou Fang , Ning Miao , Shaurya Srivastav , Jialin Liu , Ruoyu Zhang , Ruijie Fang , Asmita , Ryan Tsang , Najmeh Nazari , Han Wang , Houman Homayoun

Large Language Models (LLMs) have demonstrated remarkable potential in code generation. The integration of Chain of Thought (CoT) reasoning can further boost their performance. However, current CoT methods often require manual writing or…

Software Engineering · Computer Science 2024-08-06 Guang Yang , Yu Zhou , Xiang Chen , Xiangyu Zhang , Terry Yue Zhuo , Taolue Chen

Large Language Models (LLMs) have recently made significant advances in code generation through the 'Chain-of-Thought' prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate…

Software Engineering · Computer Science 2024-03-21 Zhihong Sun , Chen Lyu , Bolun Li , Yao Wan , Hongyu Zhang , Ge Li , Zhi Jin

Code reasoning tasks are becoming prevalent in large language model (LLM) assessments. Yet, there is a dearth of studies on the impact of real-world complexities on code reasoning, e.g., inter- or intra-procedural dependencies, API calls,…

Software Engineering · Computer Science 2026-04-27 Changshu Liu , Alireza Ghazanfari , Yang Chen , Reyhaneh Jabbarvand

Large Language Models (LLMs) have demonstrated their remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to use human languages to automatically generate…

Software Engineering · Computer Science 2025-04-03 Nam Huynh , Beiyu Lin

Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution. Previous work has claimed that this can be mitigated with chain of thought prompting-a method of demonstrating solution…

Artificial Intelligence · Computer Science 2025-03-13 Kaya Stechly , Karthik Valmeekam , Subbarao Kambhampati

Large language models (LLMs) have scaled up to unlock a wide range of complex reasoning tasks with the aid of various prompting methods. However, current prompting methods generate natural language intermediate steps to help reasoning,…

Computation and Language · Computer Science 2023-10-10 Yi Hu , Haotong Yang , Zhouchen Lin , Muhan Zhang

Despite the remarkable success of large language models (LLMs) on traditional natural language processing tasks, their planning ability remains a critical bottleneck in tackling complex multi-step reasoning tasks. Existing approaches mainly…

Computation and Language · Computer Science 2024-10-07 Jiaxin Wen , Jian Guan , Hongning Wang , Wei Wu , Minlie Huang

Chain-of-thought emerges as a promising technique for eliciting reasoning capabilities from Large Language Models (LLMs). However, it does not always improve task performance or accurately represent reasoning processes, leaving unresolved…

Computation and Language · Computer Science 2024-12-13 Guangsheng Bao , Hongbo Zhang , Cunxiang Wang , Linyi Yang , Yue Zhang

While large language models (LLMs) have demonstrated remarkable reasoning capabilities, they often struggle with complex tasks that require specific thinking paradigms, such as divide-and-conquer and procedural deduction, \etc Previous…

Software Engineering · Computer Science 2025-06-05 Kechi Zhang , Ge Li , Jia Li , Huangzhao Zhang , Jingjing Xu , Hao Zhu , Lecheng Wang , Jia Li , Yihong Dong , Jing Mai , Bin Gu , Zhi Jin

Security vulnerabilities are increasingly prevalent in modern software and they are widely consequential to our society. Various approaches to defending against these vulnerabilities have been proposed, among which those leveraging deep…

Cryptography and Security · Computer Science 2024-02-28 Yu Nong , Mohammed Aldeen , Long Cheng , Hongxin Hu , Feng Chen , Haipeng Cai
‹ Prev 1 2 3 10 Next ›