English
Related papers

Related papers: CoaCor: Code Annotation for Code Retrieval with Re…

200 papers

In this work, we propose and study annotated code search: the retrieval of code snippets paired with brief descriptions of their intent using natural language queries. On three benchmark datasets, we investigate how code retrieval systems…

Information Retrieval · Computer Science 2020-08-28 Geert Heyman , Tom Van Cutsem

Code retrieval helps developers reuse the code snippet in the open-source projects. Given a natural language description, code retrieval aims to search for the most relevant code among a set of code. Existing state-of-the-art approaches…

Computation and Language · Computer Science 2020-08-21 Qihao Zhu , Zeyu Sun , Xiran Liang , Yingfei Xiong , Lu Zhang

Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has proved that statistical language modeling with transformers can greatly…

Software Engineering · Computer Science 2022-03-16 Shuai Lu , Nan Duan , Hojae Han , Daya Guo , Seung-won Hwang , Alexey Svyatkovskiy

Code retrieval is allowing software engineers to search codes through a natural language query, which relies on both natural language processing and software engineering techniques. There have been several attempts on code retrieval from…

Software Engineering · Computer Science 2021-10-19 Mehdi Bahrami , N. C. Shrikanth , Yuji Mizobuchi , Lei Liu , Masahiro Fukuyori , Wei-Peng Chen , Kazuki Munakata

Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang , Sirong Chen , Cuiyun Gao , Zhenhao Li , Xing Hu , Kui Liu , Xin Xia

Source code search plays an important role in software development, e.g. for exploratory development or opportunistic reuse of existing code from a code base. Often, exploration of different implementations with the same functionality is…

Software Engineering · Computer Science 2021-06-08 Muhammad Hammad , Önder Babur , Hamid Abdul Basit , Mark van den Brand

Code summarization generates brief natural language description given a source code snippet, while code retrieval fetches relevant source code given a natural language query. Since both tasks aim to model the association between natural…

Information Retrieval · Computer Science 2020-02-26 Wei Ye , Rui Xie , Jinglei Zhang , Tianxiang Hu , Xiaoyin Wang , Shikun Zhang

Pre-trained language models have achieved promising success in code retrieval tasks, where a natural language documentation query is given to find the most relevant existing code snippet. However, existing models focus only on optimizing…

Software Engineering · Computer Science 2022-12-22 Dong Li , Yelong Shen , Ruoming Jin , Yi Mao , Kuan Wang , Weizhu Chen

Code retrieval is a key task aiming to match natural and programming languages. In this work, we propose adversarial learning for code retrieval, that is regularized by question-description relevance. First, we adapt a simple adversarial…

Computation and Language · Computer Science 2020-11-11 Jie Zhao , Huan Sun

In models to generate program source code from natural language, representing this code in a tree structure has been a common approach. However, existing methods often fail to generate complex code correctly due to a lack of ability to…

Computation and Language · Computer Science 2018-08-31 Shirley Anugrah Hayati , Raphael Olivier , Pravalika Avvaru , Pengcheng Yin , Anthony Tomasic , Graham Neubig

Finding codes given natural language query isb eneficial to the productivity of software developers. Future progress towards better semantic matching between query and code requires richer supervised training resources. To remedy this, we…

Computation and Language · Computer Science 2021-05-28 Junjie Huang , Duyu Tang , Linjun Shou , Ming Gong , Ke Xu , Daxin Jiang , Ming Zhou , Nan Duan

Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or…

Software Engineering · Computer Science 2021-09-13 Md Rizwan Parvez , Wasi Uddin Ahmad , Saikat Chakraborty , Baishakhi Ray , Kai-Wei Chang

Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation…

Software Engineering · Computer Science 2026-01-28 Tianyue Jiang , Yanli Wang , Yanlin Wang , Daya Guo , Ensheng Shi , Yuchi Ma , Jiachi Chen , Zibin Zheng

Large Language Models (LLMs) demonstrate strong capabilities in general coding tasks but encounter two key challenges when optimizing code: (i) the complexity of writing optimized code (such as performant CUDA kernels and competition-level…

Machine Learning · Computer Science 2026-01-12 Jiefu Ou , Sapana Chaudhary , Kaj Bostrom , Nathaniel Weir , Shuai Zhang , Huzefa Rangwala , George Karypis

Software developers routinely search for code using general-purpose search engines. However, these search engines cannot find code semantically unless it has an accompanying description. We propose a technique for semantic code search: A…

Machine Learning · Computer Science 2024-01-24 Marcelo de Rezende Martins , Marco A. Gerosa

As text and code resources have expanded, large-scale pre-trained models have shown promising capabilities in code generation tasks, typically employing supervised fine-tuning with problem statement-program pairs. However, increasing model…

Computation and Language · Computer Science 2025-04-10 Nathanaël Beau , Benoît Crabbé

Pretrained language models have shown strong effectiveness in code-related tasks, such as code retrieval, code generation, code summarization, and code completion tasks. In this paper, we propose COde assistaNt viA retrieval-augmeNted…

Computation and Language · Computer Science 2024-11-05 Xinze Li , Hanbin Wang , Zhenghao Liu , Shi Yu , Shuo Wang , Yukun Yan , Yukai Fu , Yu Gu , Ge Yu

This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during…

Software Engineering · Computer Science 2024-04-17 Anthony Saieva , Saikat Chakraborty , Gail Kaiser

A prerequisite for coding agents to perform tasks on large repositories is code localization - the identification of relevant files, classes, and functions to work on. While repository-level code localization has been performed using…

Modern causal language models, followed by rapid developments in discrete diffusion models, can now produce a wide variety of interesting and useful content. However, these families of models are predominantly trained to output tokens with…

Computation and Language · Computer Science 2025-08-19 Long Ma , Fangwei Zhong , Yizhou Wang
‹ Prev 1 2 3 10 Next ›