Related papers: Code Semantic Zooming
Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks…
Large Language Models (LLMs) have emerged as coding assistants, capable of generating source code from natural language prompts. With the increasing adoption of LLMs in software development, academic research and industry based projects are…
Programmers increasingly rely on Large Language Models (LLMs) for code generation. However, misalignment between programmers' goals and generated code complicates the code evaluation process and demands frequent switching between prompt…
Large Language Models (LLMs) have shown promising performance in code generation. However, how to reliably evaluate code generated by LLMs remains an unresolved problem. This paper presents CodeJudge, a code evaluation framework that…
Many software analysis methods have come to rely on machine learning approaches. Code segmentation - the process of decomposing source code into meaningful blocks - can augment these methods by featurizing code, reducing noise, and limiting…
With the growth of natural language processing techniques and demand for improved software engineering efficiency, there is an emerging interest in translating intention from human languages to programming languages. In this survey paper,…
Code review is one of the key processes in the software development lifecycle and is essential to maintain code quality. However, manual code review is subjective and time consuming. Given its rule-based nature, code review is well suited…
Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks, known as Code LLMs, particularly in code generation that generates source code with LLM from natural language descriptions. This…
Large language models (LLMs) have showcased remarkable prowess in code generation. However, automated code generation is still challenging since it requires a high-level semantic mapping between natural language requirements and codes. Most…
Code analysis is fundamental in Software Engineering, supporting debugging, optimization, and security assessment. Human developers approach it through syntax parsing, static semantics inference, and dynamic reasoning. Traditional tools are…
Although large language models (LLMs) show promising potential in code translation, they still struggle to generate accurate translations using the commonly adopted direct code-to-code translation approach, which converts an original…
Despite the remarkable success of large language models (LLMs) on traditional natural language processing tasks, their planning ability remains a critical bottleneck in tackling complex multi-step reasoning tasks. Existing approaches mainly…
With the emergence and rapid evolution of large language models (LLM), automating coding tasks has become an important research topic. Many efforts are underway and literature abounds about the efficacy of models and their ability to…
Large Language Models (LLMs) have made significant strides in code generation and problem solving. Current approaches employ external tool-based iterative debuggers that use compiler or other tool-based runtime feedback to refine coarse…
In large language models (LLMs), code and reasoning reinforce each other: code offers an abstract, modular, and logic-driven structure that supports reasoning, while reasoning translates high-level goals into smaller, executable steps that…
Code generation agents powered by large language models (LLMs) are revolutionizing the software development paradigm. Distinct from previous code generation techniques, code generation agents are characterized by three core features. 1)…
Automating the decision of whether a code change requires manual review is vital for maintaining software quality in modern development workflows. However, the emergence of new programming languages and frameworks creates a critical…
Source code segmentation, dividing code into functionally coherent segments, is crucial for knowledge retrieval and maintenance in software development. While enabling efficient navigation and comprehension of large codebases, manual and…
Code generation aims to automatically generate source code from high-level task specifications, which can significantly increase productivity of software engineering. Recently, approaches based on large language models (LLMs) have shown…
Large language models (LLMs) have shown remarkable ability to generate code, yet their outputs often violate syntactic or semantic constraints when guided only through natural language prompts. We introduce TreeCoder, the most general and…