Related papers: BDiff: Block-aware and Accurate Text-based Code Di…
Large Language Models (LLMs) are increasingly used for code editing, yet the prevalent full-code generation paradigm suffers from severe efficiency bottlenecks, posing challenges for interactive coding assistants that demand low latency and…
Differential testing can be an effective way to find bugs in software systems with multiple implementations that conform to the same specification, like compilers, network protocol parsers, or language runtimes. Specifications for such…
Software defect prediction aims to identify defect-prone code, aiding developers in optimizing testing resource allocation. Most defect prediction approaches primarily focus on coarse-grained, file-level defect prediction, which fails to…
A source code difference (diff) indicates changes made by comparing new and old source codes, and it can be utilized in code reviews to help developers understand the changes made to the code. Although many diff generation methods have been…
Activity diagrams (ADs) have recently become widely used in the modeling of workflows, business processes, and web-services, where they serve various purposes, from documentation, requirement definitions, and test case specifications, to…
Reliable handling of code diffs is central to agents that edit and refactor repositories at scale. We introduce Diff-XYZ, a compact benchmark for code-diff understanding with three supervised tasks: apply (old code $+$ diff $\rightarrow$…
Text spotting has seen tremendous progress in recent years yielding performant techniques which can extract text at the character, word or line level. However, extracting blocks of text from images (block-level text spotting) is relatively…
Recent advances in block diffusion language models have demonstrated competitive performance and strong scalability on reasoning tasks. However, existing BDLMs have limited exploration under the test-time scaling setting and face more…
Bug localization refers to the identification of source code files which is in a programming language and also responsible for the unexpected behavior of software using the bug report, which is a natural language. As bug localization is…
Code generation is increasingly critical for real-world applications. Still, diffusion-based large language models continue to struggle with this demand. Unlike free-form text, code requires syntactic precision; even minor structural…
Locating bugs is an important, but effort-intensive and time-consuming task, when dealing with large-scale systems. To address this, Information Retrieval (IR) techniques are increasingly being used to suggest potential buggy source code…
Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is a valuable information to…
Large Language Models (LLMs) have reshaped natural language processing, powering applications from multi-hop retrieval and question answering to autonomous agent workflows. Yet, prompt engineering -- the task of crafting textual inputs to…
Software undergoes constant changes to support new requirements, address bugs, enhance performance, and ensure maintainability. Thus, developers spend a great portion of their workday trying to understand and review the code changes of…
Deep Learning methods are becoming prominent in automated software bug detection; however, they lack the global understanding of the given code. Consequently, their performance tends to degrade, especially when they are applied to large…
Large Language Models (LLMs) have demonstrated remarkable capabilities in code editing, substantially enhancing software development productivity. However, the inherent complexity of code editing tasks forces existing approaches to rely on…
Teachers face several challenges when presenting the fundamental concepts of programming in the classroom. Several tools are introduced to give a visual dimension to support the learning process. These tools rely on code blocks, easily…
Large Language Models (LLMs) have demonstrated impressive capabilities in understanding and generating codes. Due to these capabilities, many recent methods are proposed to automatically refine the codes with LLMs. However, we should…
Code review is a key development practice that contributes to improve software quality and to foster knowledge sharing among developers. However, code review usually takes time and demands detailed and time-consuming analysis of textual…
This study investigates the problem of learning linear block codes optimized for Belief-Propagation decoders significantly improving performance compared to the state-of-the-art. Our previous research is extended with an enhanced system…