Related papers: BDiff: Block-aware and Accurate Text-based Code Di…

To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing

Large Language Models (LLMs) are increasingly used for code editing, yet the prevalent full-code generation paradigm suffers from severe efficiency bottlenecks, posing challenges for interactive coding assistants that demand low latency and…

Software Engineering · Computer Science 2026-05-01 Wei Cheng, Yongchang Cao, Chen Shen, Binhua Li, Jue Chen, Yongbin Li, Wei Hu

DiffSpec: Differential Testing with LLMs using Natural Language Specifications and Code Artifacts

Differential testing can be an effective way to find bugs in software systems with multiple implementations that conform to the same specification, like compilers, network protocol parsers, or language runtimes. Specifications for such…

Software Engineering · Computer Science 2025-05-07 Nikitha Rao, Elizabeth Gilbert, Harrison Green, Tahina Ramananandro, Nikhil Swamy, Claire Le Goues, Sarah Fakhoury

BAFLineDP: Code Bilinear Attention Fusion Framework for Line-Level Defect Prediction

Software defect prediction aims to identify defect-prone code, aiding developers in optimizing testing resource allocation. Most defect prediction approaches primarily focus on coarse-grained, file-level defect prediction, which fails to…

Software Engineering · Computer Science 2024-02-13 Shaojian Qiu, Huihao Huang, Jianxiang Luo, Yingjie Kuang, Haoyu Luo

Toward Interactive Optimization of Source Code Differences: An Empirical Study of Its Performance

A source code difference (diff) indicates changes made by comparing new and old source codes, and it can be utilized in code reviews to help developers understand the changes made to the code. Although many diff generation methods have been…

Software Engineering · Computer Science 2024-09-27 Tsukasa Yagi, Shinpei Hayashi

ADDiff: Semantic Differencing for Activity Diagrams

Activity diagrams (ADs) have recently become widely used in the modeling of workflows, business processes, and web-services, where they serve various purposes, from documentation, requirement definitions, and test case specifications, to…

Software Engineering · Computer Science 2014-09-09 Shahar Maoz, Jan Oliver Ringert, Bernhard Rumpe

Diff-XYZ: A Benchmark for Evaluating Diff Understanding

Reliable handling of code diffs is central to agents that edit and refactor repositories at scale. We introduce Diff-XYZ, a compact benchmark for code-diff understanding with three supervised tasks: apply (old code $+$ diff $\rightarrow$…

Software Engineering · Computer Science 2025-11-18 Evgeniy Glukhov, Michele Conti, Egor Bogomolov, Yaroslav Golubev, Alexander Bezzubov

Block-level Text Spotting with LLMs

Text spotting has seen tremendous progress in recent years yielding performant techniques which can extract text at the character, word or line level. However, extracting blocks of text from images (block-level text spotting) is relatively…

Computer Vision and Pattern Recognition · Computer Science 2024-06-21 Ganesh Bannur, Bharadwaj Amrutur

Advancing Block Diffusion Language Models for Test-Time Scaling

Recent advances in block diffusion language models have demonstrated competitive performance and strong scalability on reasoning tasks. However, existing BDLMs have limited exploration under the test-time scaling setting and face more…

Computation and Language · Computer Science 2026-02-12 Yi Lu, Deyang Kong, Jianing Wang, Linsen Guo, Xue Wang, Qi Guo, Tao Gui, Xuanjing Huang, Wei Ye, Shikun Zhang, Wei Wang

Aligning Programming Language and Natural Language: Exploring Design Choices in Multi-Modal Transformer-Based Embedding for Bug Localization

Bug localization refers to the identification of source code files which is in a programming language and also responsible for the unexpected behavior of software using the bug report, which is a natural language. As bug localization is…

Software Engineering · Computer Science 2024-06-26 Partha Chakraborty, Venkatraman Arumugam, Meiyappan Nagappan

TreeDiff: AST-Guided Code Generation with Diffusion LLMs

Code generation is increasingly critical for real-world applications. Still, diffusion-based large language models continue to struggle with this demand. Unlike free-form text, code requires syntactic precision; even minor structural…

Computation and Language · Computer Science 2026-01-07 Yiming Zeng, Jinghan Cao, Zexin Li, Yiming Chen, Tao Ren, Zhuochun Li, Dawei Xiang, Xidong Wu, Shangqian Gao, Tingting Yu

BoostNSift: A Query Boosting and Code Sifting Technique for Method Level Bug Localization

Locating bugs is an important, but effort-intensive and time-consuming task, when dealing with large-scale systems. To address this, Information Retrieval (IR) techniques are increasingly being used to suggest potential buggy source code…

Software Engineering · Computer Science 2021-08-31 Abdul Razzaq, Jim Buckley, James Vincent Patten, Muslim Chochlov, Ashish Rajendra Sai

RefDiff: Detecting Refactorings in Version Histories

Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is a valuable information to…

Software Engineering · Computer Science 2018-08-07 Danilo Silva, Marco Tulio Valente

LLM-AutoDiff: Auto-Differentiate Any LLM Workflow

Large Language Models (LLMs) have reshaped natural language processing, powering applications from multi-hop retrieval and question answering to autonomous agent workflows. Yet, prompt engineering -- the task of crafting textual inputs to…

Computation and Language · Computer Science 2025-01-31 Li Yin, Zhangyang Wang

A Novel Refactoring and Semantic Aware Abstract Syntax Tree Differencing Tool and a Benchmark for Evaluating the Accuracy of Diff Tools

Software undergoes constant changes to support new requirements, address bugs, enhance performance, and ensure maintainability. Thus, developers spend a great portion of their workday trying to understand and review the code changes of…

Software Engineering · Computer Science 2024-08-26 Pouria Alikhanifard, Nikolaos Tsantalis

FGDM: Reasoning Aware Multi-Agentic Framework for Software Bug Detection using Chain of Thought and Tree of Thought Prompting

Deep Learning methods are becoming prominent in automated software bug detection; however, they lack the global understanding of the given code. Consequently, their performance tends to degrade, especially when they are applied to large…

Software Engineering · Computer Science 2026-04-29 Srita Padmanabhuni, Bhargavi Karuturi, Jerusha Karen Indupalli, Santhan Reddy Chilla, Vivek Yelleti

EfficientEdit: Accelerating Code Editing via Edit-Oriented Speculative Decoding

Large Language Models (LLMs) have demonstrated remarkable capabilities in code editing, substantially enhancing software development productivity. However, the inherent complexity of code editing tasks forces existing approaches to rely on…

Software Engineering · Computer Science 2025-10-01 Peiding Wang, Li Zhang, Fang Liu, Yinghao Zhu, Wang Xu, Lin Shi, Xiaoli Lian, Minxiao Li, Bo Shen, An Fu

Comparison of block-based and hybrid-based environments in transferring programming skills to text-based environments

Teachers face several challenges when presenting the fundamental concepts of programming in the classroom. Several tools are introduced to give a visual dimension to support the learning process. These tools rely on code blocks, easily…

Computers and Society · Computer Science 2019-11-22 Hussein Alrubaye, Stephanie Ludi, Mohamed Wiem Mkaouer

Rethinking Code Refinement: Learning to Judge Code Efficiency

Large Language Models (LLMs) have demonstrated impressive capabilities in understanding and generating codes. Due to these capabilities, many recent methods are proposed to automatically refine the codes with LLMs. However, we should…

Software Engineering · Computer Science 2024-10-31 Minju Seo, Jinheon Baek, Sung Ju Hwang

RAID: Tool Support for Refactoring-Aware Code Reviews

Code review is a key development practice that contributes to improve software quality and to foster knowledge sharing among developers. However, code review usually takes time and demands detailed and time-consuming analysis of textual…

Software Engineering · Computer Science 2021-03-23 Rodrigo Brito, Marco Tulio Valente

Learning Linear Block Codes with Gradient Quantization

This study investigates the problem of learning linear block codes optimized for Belief-Propagation decoders significantly improving performance compared to the state-of-the-art. Our previous research is extended with an enhanced system…

Signal Processing · Electrical Eng. & Systems 2025-10-02 Louis-Adrien Dufrène, Quentin Lampin, Guillaume Larue