Related papers: SelfPiCo: Self-Guided Partial Code Execution with …

LExecutor: Learning-Guided Execution

Executing code is essential for various program analysis tasks, e.g., to detect bugs that manifest through exceptions or to obtain execution traces for further dynamic analysis. However, executing an arbitrary piece of code is often…

Software Engineering · Computer Science 2023-11-13 Beatriz Souza, Michael Pradel

StackPilot: Autonomous Function Agents for Scalable and Environment-Free Code Execution

Recent advances in large language models (LLMs) have substantially enhanced automated code generation across a wide range of programming languages. Nonetheless, verifying the correctness and executability of LLM-generated code remains a…

Programming Languages · Computer Science 2026-01-14 Xinkui Zhao, Yifan Zhang, Zhengyi Zhou, Yueshen Xu

Gistify! Codebase-Level Understanding via Runtime Execution

As coding agents are increasingly deployed in large codebases, the need to automatically design challenging, codebase-level evaluation is central. We propose Gistify, a task where a coding LLM must create a single, minimal, self-contained…

Computation and Language · Computer Science 2025-10-31 Hyunji Lee, Minseon Kim, Chinmay Singh, Matheus Pereira, Atharv Sonwane, Isadora White, Elias Stengel-Eskin, Mohit Bansal, Zhengyan Shi, Alessandro Sordoni, Marc-Alexandre Côté, Xingdi Yuan, Lucas Caccia

Predicting Code Coverage without Execution

Code coverage is a widely used metric for quantifying the extent to which program elements, such as statements or branches, are executed during testing. Calculating code coverage is resource-intensive, requiring code building and execution…

Software Engineering · Computer Science 2023-07-26 Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement

Executing Natural Language-Described Algorithms with Large Language Models: An Investigation

Executing computer programs described in natural language has long been a pursuit of computer science. With the advent of enhanced natural language understanding capabilities exhibited by large language models (LLMs), the path toward this…

Computation and Language · Computer Science 2024-03-15 Xin Zheng, Qiming Zhu, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun

Optimizing Large Language Models for OpenAPI Code Completion

Recent advancements in Large Language Models (LLMs) and their utilization in code generation tasks have significantly reshaped the field of software development. Despite the remarkable efficacy of code completion solutions in mainstream…

Software Engineering · Computer Science 2024-06-12 Bohdan Petryshyn, Mantas Lukoševičius

ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?

Although large language models (LLMs) have been largely successful in generating functionally correct programs, conditioning models to produce efficient solutions while ensuring correctness remains a challenge. Further, unreliability in…

Computation and Language · Computer Science 2024-10-11 Siddhant Waghjale, Vishruth Veerendranath, Zora Zhiruo Wang, Daniel Fried

LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation

Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent. However, because NL is informal, it does not lend itself easily to checking…

Software Engineering · Computer Science 2024-10-04 Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, Shuvendu K. Lahiri

Self-Execution Simulation Improves Coding Models

A promising research direction in enabling LLMs to generate consistently correct code involves addressing their inability to properly estimate program execution, particularly for code they generate. In this work, we demonstrate that Code…

Computation and Language · Computer Science 2026-04-07 Gallil Maimon, Ori Yoran, Felix Kreuk, Michael Hassid, Gal Cohen, Pierre Chambon, Yossi Adi

Cerberus: Multi-Agent Reasoning and Coverage-Guided Exploration for Static Detection of Runtime Errors

In several software development scenarios, it is desirable to detect runtime errors and exceptions in code snippets without actual execution. A typical example is to detect runtime exceptions in online code snippets before integrating them…

Software Engineering · Computer Science 2025-12-29 Hridya Dhulipala, Xiaokai Rong, Tien N. Nguyen

Treefix: Enabling Execution with a Tree of Prefixes

The ability to execute code is a prerequisite for various dynamic program analyses. Learning-guided execution has been proposed as an approach to enable the execution of arbitrary code snippets by letting a neural model predict likely…

Software Engineering · Computer Science 2025-01-24 Beatriz Souza, Michael Pradel

Context-Guided Decompilation: A Step Towards Re-executability

Binary decompilation plays an important role in software security analysis, reverse engineering, and malware understanding when source code is unavailable. However, existing decompilation techniques often fail to produce source code that…

Software Engineering · Computer Science 2026-04-14 Xiaohan Wang, Yuxin Hu, Kevin Leach

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

Algorithmic reasoning refers to the ability to understand the complex patterns behind the problem and decompose them into a sequence of reasoning steps towards the solution. Such nature of algorithmic reasoning makes it a challenge for…

Computation and Language · Computer Science 2024-04-04 Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, Seonghwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, Jinyoung Yeo

Demystifying Errors in LLM Reasoning Traces: An Empirical Study of Code Execution Simulation

Understanding a program's runtime reasoning behavior, meaning how intermediate states and control flows lead to final execution results, is essential for reliable code generation, debugging, and automated reasoning. Although large language…

Software Engineering · Computer Science 2025-12-02 Mohammad Abdollahi, Khandaker Rifah Tasnia, Soumit Kanti Saha, Jinqiu Yang, Song Wang, Hadi Hemmati

Compiling Code LLMs into Lightweight Executables

The demand for better prediction accuracy and higher execution performance in neural networks continues to grow. The emergence and success of Large Language Models (LLMs) have produced many cloud-based tools for software engineering tasks…

Software Engineering · Computer Science 2026-04-27 Jieke Shi, Junda He, Zhou Yang, Chengran Yang, Mykhailo Klymenko, Thong Hoang, Xiwei Xu, Zhenchang Xing, David Lo

ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation

Code translation is a crucial activity in the software development and maintenance process, and researchers have recently begun to focus on using pre-trained large language models (LLMs) for code translation. However, existing LLMs only…

Software Engineering · Computer Science 2025-09-30 Minghua He, Yue Chen, Fangkai Yang, Pu Zhao, Wenjie Yin, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

SIMCOPILOT: Evaluating Large Language Models for Copilot-Style Code Generation

We introduce SIMCOPILOT, a benchmark that simulates the role of large language models (LLMs) as interactive, "copilot"-style coding assistants. Targeting both completion (finishing incomplete methods or code blocks) and infill tasks…

Machine Learning · Computer Science 2025-05-29 Mingchao Jiang, Abhinav Jain, Sophia Zorek, Chris Jermaine

Large Language Model-Driven Code Compliance Checking in Building Information Modeling

This research addresses the time-consuming and error-prone nature of manual code compliance checking in Building Information Modeling (BIM) by introducing a Large Language Model (LLM)-driven approach to semi-automate this critical process.…

Software Engineering · Computer Science 2025-06-26 Soumya Madireddy, Lu Gao, Zia Din, Kinam Kim, Ahmed Senouci, Zhe Han, Yunpeng Zhang

Large Language Models as Code Executors: An Exploratory Study

The capabilities of Large Language Models (LLMs) have significantly evolved, extending from natural language processing to complex tasks like code understanding and generation. We expand the scope of LLMs' capabilities to a broader context,…

Computation and Language · Computer Science 2024-10-11 Chenyang Lyu, Lecheng Yan, Rui Xing, Wenxi Li, Younes Samih, Tianbo Ji, Longyue Wang

CIFE: Code Instruction-Following Evaluation

Large Language Models (LLMs) are increasingly applied to real-world code generation, where functional correctness alone is insufficient for reliable deployment; developers also expect adherence to explicit requirements for robustness,…

Software Engineering · Computer Science 2025-12-22 Sravani Gunnu, Shanmukha Guttula, Hima Patel