Related papers: Pointing to Subwords for Generating Function Names…
Recent neural sequence-to-sequence models with a copy mechanism have achieved remarkable progress in various text generation tasks. These models addressed out-of-vocabulary problems and facilitated the generation of rare words. However, the…
Summary descriptions of subroutines are short (usually one-sentence) natural language explanations of a subroutine's behavior and purpose in a program. These summaries are ubiquitous in documentation, and many tools such as JavaDocs and…
Source code summarization is the task of creating short, natural language descriptions of source code. Code summarization is the backbone of much software documentation such as JavaDocs, in which very brief comments such as "adds the…
We find generating functions the number of strings (words) containing a specified number of occurrences of certain types of order-isomorphic classes of substrings called subword patterns. In particular, we find generating functions for the…
Stack Overflow has been heavily used by software developers as a popular way to seek programming-related information from peers via the internet. The Stack Overflow community recommends users to provide the related code snippet when they…
High quality method names are descriptive and readable, which are helpful for code development and maintenance. The majority of recent research suggest method names based on the text summarization approach. They take the token sequence and…
Good comments help developers understand software faster and provide better maintenance. However, comments are often missing, generally inaccurate, or out of date. Many of these problems can be avoided by automatic comment generation. This…
Source Code Summarization is the task of writing short, natural language descriptions of source code. The main use for these descriptions is in software documentation e.g. the one-sentence Java method descriptions in JavaDocs. Code…
Automated source code refactoring, particularly extract method refactoring, is a crucial and frequently employed technique during software development. Despite its importance and frequent use by practitioners, current automated techniques…
In models to generate program source code from natural language, representing this code in a tree structure has been a common approach. However, existing methods often fail to generate complex code correctly due to a lack of ability to…
Code summarization is the task of generating natural language description of source code, which is important for program understanding and maintenance. Existing approaches treat the task as a machine translation problem (e.g., from Java to…
Refactoring is an indispensable practice of improving the quality and maintainability of source code in software evolution. Rename refactoring is the most frequently performed refactoring that suggests a new name for an identifier to…
Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly-growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance.…
Statistical language modeling techniques have successfully been applied to source code, yielding a variety of new software development tools, such as tools for code suggestion and improving readability. A major issue with these techniques…
In retrieval-augmented generation (RAG) question answering systems, generating citations for large language model (LLM) outputs enhances verifiability and helps users identify potential hallucinations. However, we observe two problems in…
We study the new problem of Huffman-like codes subject to individual restrictions on the code-word lengths of a subset of the source words. These are prefix codes with minimal expected code-word length for a random source where additionally…
Pseudo code is one of the valuable artifacts to comprehending the complex program codes. Most of the source code still has no equivalent pseudo code, due to the time-consuming process of writing pseudo codes. In this work, we have developed…
Java Code Generation consists in generating automatically Java code from a Natural Language Text. This NLP task helps in increasing programmers' productivity by providing them with immediate solutions to the simplest and most repetitive…
We study the problem of generating source code in a strongly typed, Java-like programming language, given a label (for example a set of API calls or types) carrying a small amount of information about the code that is desired. The generated…
The dominant text generation models compose the output by sequentially selecting words from a fixed vocabulary. In this paper, we formulate text generation as progressively copying text segments (e.g., words or phrases) from an existing…