Related papers: Assessing Code Generation with Intermediate Langua…

NL in the Middle: Code Translation with LLMs and Intermediate Representations

Studies show that large language models (LLMs) produce buggy code translations. One promising avenue to improve translation accuracy is through intermediate representations, which provide structured guidance for the translation process. We…

Software Engineering · Computer Science 2025-09-18 Chi-en Amy Tai, Pengyu Nie, Lukasz Golab, Alexander Wong

Understanding Chain-of-Thought Effectiveness in Code Generation: An Empirical and Information-Theoretic Analysis

Large language models (LLMs) achieve strong performance on code generation, but the mechanisms by which Chain-of-Thought (CoT) prompting helps remain unclear. We present a systematic empirical and information-theoretic study of CoT…

Software Engineering · Computer Science 2025-12-11 Naizhu Jin, Zhong Li, Guang Yang, Tian Zhang, Qingkai Zeng

Are They All Good? Evaluating the Quality of CoTs in LLM-based Code Generation

Large language models (LLMs) have demonstrated impressive performance in code generation, particularly when augmented with chain-of-thought (CoT) prompting techniques. They break down requirements into intermediate reasoning steps, which…

Software Engineering · Computer Science 2025-07-10 Binquan Zhang, Li Zhang, Zhiwen Luo, Yuxin Du, Fang Liu, Song Wang, Lin Shi

Chain-of-Thought in Neural Code Generation: From and For Lightweight Language Models

Large Language Models (LLMs) have demonstrated remarkable potential in code generation. The integration of Chain of Thought (CoT) reasoning can further boost their performance. However, current CoT methods often require manual writing or…

Software Engineering · Computer Science 2024-08-06 Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, Taolue Chen

LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens

Large reasoning models (LRMs) have led to new possibilities in terms of problem-solving, through the devising of a natural language thought process prior to answering a query. While their capabilities are well known across mathematics and…

Computation and Language · Computer Science 2025-10-15 Armel Zebaze, Rachel Bawden, Benoît Sagot

Beyond Words: A Latent Memory Approach to Internal Reasoning in LLMs

Recent advances in large language models (LLMs) have popularized the chain-of-thought (CoT) paradigm, in which models produce explicit reasoning steps in natural language. Although this approach improves interpretability and facilitates…

Computation and Language · Computer Science 2025-03-03 José I. Orlicki

Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation

Large Language Models (LLMs) have become a popular choice for many Natural Language Processing (NLP) tasks due to their versatility and ability to produce high-quality results. Specifically, they are increasingly used for automatic code…

Artificial Intelligence · Computer Science 2024-08-30 Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, Merieme Bouhandi, Walid Dahhane, El Hassane Ettifouri

Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

Large Language Models (LLMs) have shown impressive performance on complex tasks through Chain-of-Thought (CoT) reasoning. However, conventional CoT relies on explicitly verbalized intermediate steps, which constrains its broader…

Computation and Language · Computer Science 2025-11-04 Xinghao Chen, Anhao Zhao, Heming Xia, Xuan Lu, Hanlin Wang, Yanjun Chen, Wei Zhang, Jian Wang, Wenjie Li, Xiaoyu Shen

How Do Humans Write Code? Large Models Do It the Same Way Too

Program-of-Thought (PoT) replaces natural language-based Chain-of-Thought (CoT) as the most popular method in Large Language Models (LLMs) mathematical reasoning tasks by utilizing external tool calls to circumvent computational errors.…

Artificial Intelligence · Computer Science 2025-09-10 Long Li, Xuzheng He, Haozhe Wang, Linlin Wang, Liang He

UniCoder: Scaling Code Large Language Model via Universal Code

Intermediate reasoning or acting steps have successfully improved large language models (LLMs) for handling various downstream natural language processing (NLP) tasks. When applying LLMs for code generation, recent works mainly focus on…

Computation and Language · Computer Science 2024-06-25 Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li

Structured Chain-of-Thought Prompting for Code Generation

Large Language Models (LLMs) (e.g., ChatGPT) have shown impressive performance in code generation. LLMs take prompts as inputs, and Chain-of-Thought (CoT) prompting is the state-of-the-art prompting technique. CoT prompting asks LLMs first…

Software Engineering · Computer Science 2023-09-08 Jia Li, Ge Li, Yongmin Li, Zhi Jin

Enhancing Code Generation for Low-Resource Languages: No Silver Bullet

The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource…

Software Engineering · Computer Science 2025-02-03 Alessandro Giagnorio, Alberto Martin-Lopez, Gabriele Bavota

Code Prompting: a Neural Symbolic Method for Complex Reasoning in Large Language Models

Large language models (LLMs) have scaled up to unlock a wide range of complex reasoning tasks with the aid of various prompting methods. However, current prompting methods generate natural language intermediate steps to help reasoning,…

Computation and Language · Computer Science 2023-10-10 Yi Hu, Haotong Yang, Zhouchen Lin, Muhan Zhang

An Empirical Study of Reasoning Steps in Thinking Code LLMs

Thinking Large Language Models (LLMs) generate explicit intermediate reasoning traces before final answers, potentially improving transparency, interpretability, and solution accuracy for code generation. However, the quality of these…

Artificial Intelligence · Computer Science 2025-11-11 Haoran Xue, Gias Uddin, Song Wang

Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought

To improve the ability of large language models (LLMs) to tackle complex reasoning problems, chain-of-thought (CoT) methods were proposed to guide LLMs to reason step-by-step, enabling problem solving from simple to complex.…

Machine Learning · Computer Science 2024-06-27 Zhen-Yu Zhang, Siwei Han, Huaxiu Yao, Gang Niu, Masashi Sugiyama

Guiding Language Model Reasoning with Planning Tokens

Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely…

Computation and Language · Computer Science 2024-08-08 Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni

Multimodal Chain-of-Thought Reasoning in Language Models

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer. However, existing CoT studies…

Computation and Language · Computer Science 2024-05-21 Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola

Self-planning Code Generation with Large Language Models

Although large language models (LLMs) have demonstrated impressive ability in code generation, they are still struggling to address the complicated intent provided by humans. It is widely acknowledged that humans typically employ planning…

Software Engineering · Computer Science 2025-10-21 Xue Jiang, Yihong Dong, Lecheng Wang, Zheng Fang, Qiwei Shang, Ge Li, Zhi Jin, Wenpin Jiao

What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

Prompting techniques such as chain-of-thought have established themselves as a popular vehicle for improving the outputs of large language models (LLMs). For code generation, however, their exact mechanics and efficacy are under-explored.…

Computation and Language · Computer Science 2025-04-09 Kunhao Zheng, Juliette Decugis, Jonas Gehring, Taco Cohen, Benjamin Negrevergne, Gabriel Synnaeve

Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting

Large Language Models, such as Generative Pre-trained Transformer 3 (aka. GPT-3), have been developed to understand language through the analysis of extensive text data, allowing them to identify patterns and connections between words.…

Computation and Language · Computer Science 2023-10-03 Baphumelele Masikisiki, Vukosi Marivate, Yvette Hlope