Related papers: Constraint-Guided Multi-Agent Decompilation for Ex…

A Neural-based Program Decompiler

Reverse engineering of binary executables is a critical problem in the computer security domain. On the one hand, malicious parties may recover interpretable source codes from the software products to gain commercial advantages. On the…

Programming Languages · Computer Science 2019-07-01 Cheng Fu , Huili Chen , Haolan Liu , Xinyun Chen , Yuandong Tian , Farinaz Koushanfar , Jishen Zhao

CoDe-R: Refining Decompiler Output with LLMs via Rationale Guidance and Adaptive Inference

Binary decompilation is a critical reverse engineering task aimed at reconstructing high-level source code from stripped executables. Although Large Language Models (LLMs) have recently shown promise, they often suffer from "logical…

Software Engineering · Computer Science 2026-04-15 Qiang Zhang , Zhongnian Li

DOCE: Finding the Sweet Spot for Execution-Based Code Generation

Recently, a diverse set of decoding and reranking procedures have been shown effective for LLM-based code generation. However, a comprehensive framework that links and experimentally compares these methods is missing. We address this by…

Computation and Language · Computer Science 2024-10-17 Haau-Sing Li , Patrick Fernandes , Iryna Gurevych , André F. T. Martins

Automatically Mitigating Vulnerabilities in Binary Programs via Partially Recompilable Decompilation

Vulnerabilities are challenging to locate and repair, especially when source code is unavailable and binary patching is required. Manual methods are time-consuming, require significant expertise, and do not scale to the rate at which new…

Cryptography and Security · Computer Science 2023-06-13 Pemma Reiter , Hui Jun Tay , Westley Weimer , Adam Doupé , Ruoyu Wang , Stephanie Forrest

LLM4Decompile: Decompiling Binary Code with Large Language Models

Decompilation aims to convert binary code to high-level source code, but traditional tools like Ghidra often produce results that are difficult to read and execute. Motivated by the advancements in Large Language Models (LLMs), we propose…

Programming Languages · Computer Science 2025-08-06 Hanzhuo Tan , Qi Luo , Jing Li , Yuqun Zhang

PCodeTrans: Translate Decompiled Pseudocode to Compilable and Executable Equivalent

Decompilation is foundational to binary analysis, yet conventional tools prioritize human readability over strict recompilability and verifiable runtime correctness. While recent LLM-based approaches attempt to refine decompiled pseudocode,…

Software Engineering · Computer Science 2026-03-17 Yuxin Cui , Zeyu Gao , Shuxian He , Siliang Qin , Chao Zhang

ReF Decompile: Relabeling and Function Call Enhanced Decompile

The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages, enabling analysis in scenarios where source code is unavailable. This task supports various reverse…

Software Engineering · Computer Science 2025-02-19 Yunlong Feng , Bohan Li , Xiaoming Shi , Qingfu Zhu , Wanxiang Che

Learning to Find Usages of Library Functions in Optimized Binaries

Much software, whether beneficent or malevolent, is distributed only as binaries, sans source code. Absent source code, understanding binaries' behavior can be quite challenging, especially when compiled under higher levels of compiler…

Software Engineering · Computer Science 2021-09-20 Toufique Ahmed , Premkumar Devanbu , Anand Ashok Sawant

Decaf: Improving Neural Decompilation with Automatic Feedback and Search

Decompilers are useful tools used in reverse engineering to understand compiled source code. Reconstructing source code from compiled binaries is a challenging task, because high-level syntax, identifiers, and custom data types are…

Software Engineering · Computer Science 2026-05-13 Alexander Shypula , Osbert Bastani , Edward Schwartz

DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios

Decompilers are fundamental tools for critical security tasks, from vulnerability discovery to malware analysis, yet their evaluation remains fragmented. Existing approaches primarily focus on syntactic correctness through synthetic…

Software Engineering · Computer Science 2025-05-19 Zeyu Gao , Yuxin Cui , Hao Wang , Siliang Qin , Yuanda Wang , Bolun Zhang , Chao Zhang

CODEFUSE-DEBENCH: An Empirical Study on Readability, Recompilability, and Functionality

Binary decompilation aims to recover binaries into high-level source code, but existing evaluations mainly rely on syntactic similarity or single-axis readability metrics, which fail to capture practical reusability. We propose a…

Software Engineering · Computer Science 2026-05-29 Puzhuo Liu , Yuhan Huang , Jianlei Chi , Peng Di , Yu Jiang

LLM4CodeRE: Generative AI for Code Decompilation Analysis and Reverse Engineering

Code decompilation analysis is a fundamental yet challenging task in malware reverse engineering, particularly due to the pervasive use of sophisticated obfuscation techniques. Although recent large language models (LLMs) have shown promise…

Cryptography and Security · Computer Science 2026-04-08 Hamed Jelodar , Samita Bai , Tochukwu Emmanuel Nwankwo , Parisa Hamedi , Mohammad Meymani , Roozbeh Razavi-Far , Ali A. Ghorbani

Improving type information inferred by decompilers with supervised machine learning

In software reverse engineering, decompilation is the process of recovering source code from binary files. Decompilers are used when it is necessary to understand or analyze software for which the source code is not available. Although…

Software Engineering · Computer Science 2021-02-25 Javier Escalada , Ted Scully , Francisco Ortin

SALT4Decompile: Inferring Source-level Abstract Logic Tree for LLM-Based Binary Decompilation

Decompilation is widely used in reverse engineering to recover high-level language code from binary executables. While recent approaches leveraging Large Language Models (LLMs) have shown promising progress, they typically treat assembly…

Software Engineering · Computer Science 2025-09-19 Yongpan Wang , Xin Xu , Xiaojie Zhu , Xiaodong Gu , Beijun Shen

Refining Decompiled C Code with Large Language Models

A C decompiler converts an executable into source code. The recovered C source code, once re-compiled, is expected to produce an executable with the same functionality as the original executable. With over twenty years of development, C…

Software Engineering · Computer Science 2023-11-30 Wai Kin Wong , Huaijin Wang , Zongjie Li , Zhibo Liu , Shuai Wang , Qiyi Tang , Sen Nie , Shi Wu

Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation

Recent advances in LLM-based decompilers have been shown effective to convert low-level binaries into human-readable source code. However, there still lacks a comprehensive benchmark that provides large-scale binary-source function pairs,…

Software Engineering · Computer Science 2025-10-21 Hanzhuo Tan , Xiaolong Tian , Hanrui Qi , Jiaming Liu , Zuchen Gao , Siyi Wang , Qi Luo , Jing Li , Yuqun Zhang

D-LiFT: Improving LLM-based Decompiler Backend via Code Quality-driven Fine-tuning

As one of the key tools in many security tasks, decompilers reconstruct human-readable source code from binaries. Yet, despite recent advances, their outputs often suffer from syntactic and semantic errors and remain difficult to read.…

Cryptography and Security · Computer Science 2025-08-19 Muqi Zou , Hongyu Cai , Hongwei Wu , Zion Leonahenahe Basque , Arslan Khan , Berkay Celik , Dave Tian , Antonio Bianchi , Ruoyu Wang , Dongyan Xu

RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance

Large Language Models (LLMs) have shown incredible potential in code generation tasks, and recent research in prompt engineering have enhanced LLMs' understanding of textual information. However, ensuring the accuracy of generated code…

Software Engineering · Computer Science 2024-10-04 Haolin Jin , Zechao Sun , Huaming Chen

"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution

Recent advances in coding agents have shown remarkable progress in software issue resolution. In practice, real-world issues are typically bug fixes or feature requests in which human developers naturally incorporate refactoring as part of…

Software Engineering · Computer Science 2026-05-22 Zhao Tian , Zifan Zhang , Tao Xiao , Dong Wang , Masanari Kondo , Junjie Chen , Yasutaka Kamei

SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?

Instructed code editing is a significant challenge for large language models (LLMs). On the EditBench benchmark, 39 of 40 evaluated models obtain a task success rate (TSR) below 60 percent, highlighting a gap between general code generation…

Software Engineering · Computer Science 2026-04-29 Noam Tarshish , Nofar Selouk , Daniel Hodisan , Bar Ezra Gafniel , Yuval Elovici , Asaf Shabtai , Eliya Nachmani