Related papers: Neural Code Search Evaluation Dataset

200 papers

Code search is a task to find programming codes that semantically match the given natural language queries. Even though some of the existing datasets for this task are multilingual on the programming language side, their query data are only…

Computation and Language · Computer Science · 2023-06-28 · Ryo Sekizawa, Nan Duan, Shuai Lu, Hitomi Yanaka

The performance of neural code search is significantly influenced by the quality of the training data from which the neural models are derived. A large corpus of high-quality query and code pairs is demanded to establish a precise mapping…

Software Engineering · Computer Science · 2022-02-15 · Zhensu Sun, Li Li, Yan Liu, Xiaoning Du, Li Li

In this work, we propose and study annotated code search: the retrieval of code snippets paired with brief descriptions of their intent using natural language queries. On three benchmark datasets, we investigate how code retrieval systems…

Information Retrieval · Computer Science · 2020-08-28 · Geert Heyman, Tom Van Cutsem

Millions of repetitive code snippets are submitted to code repositories every day. To search from these large codebases using simple natural language queries would allow programmers to ideate, prototype, and develop easier and faster.…

Semantic code search is the task of retrieving relevant code given a natural language query. While related to other information retrieval tasks, it requires bridging the gap between the language used in code (often abbreviated and highly…

Machine Learning · Computer Science · 2020-06-09 · Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, Marc Brockschmidt

Code search is an important and well-studied task, but it usually means searching for code by a text query. We argue that using a code snippet (and possibly an error traceback) as a query while looking for bugfixing instructions and code…

Computation and Language · Computer Science · 2024-05-28 · Ivan Sedykh, Dmitry Abulkhanov, Nikita Sorokin, Sergey Nikolenko, Valentin Malykh

Code writing is repetitive and predictable, inspiring us to develop various code intelligence techniques. This survey focuses on code search, that is, to retrieve code that matches a given query by effectively capturing the semantic…

Software Engineering · Computer Science · 2023-12-14 · Yutao Xie, Jiayi Lin, Hande Dong, Lei Zhang, Zhonghai Wu

(Source) code search is widely concerned by software engineering researchers because it can improve the productivity and quality of software development. Given a functionality requirement usually described in a natural language sentence, a…

Software Engineering · Computer Science · 2023-11-14 · Weisong Sun, Chunrong Fang, Yifei Ge, Yuling Hu, Yuchen Chen, Quanjun Zhang, Xiuting Ge, Yang Liu, Zhenyu Chen

Code retrieval is allowing software engineers to search codes through a natural language query, which relies on both natural language processing and software engineering techniques. There have been several attempts on code retrieval from…

Software Engineering · Computer Science · 2021-10-19 · Mehdi Bahrami, N. C. Shrikanth, Yuji Mizobuchi, Lei Liu, Masahiro Fukuyori, Wei-Peng Chen, Kazuki Munakata

Language models can serve as a valuable tool for software developers to increase productivity. Large generative models can be used for code generation and code completion, while smaller encoder-only models are capable of performing code…

Computation and Language · Computer Science · 2023-11-17 · Andor Diera, Abdelhalim Dahou, Lukas Galke, Fabian Karl, Florian Sihler, Ansgar Scherp

With the recent explosion in the size and complexity of source codebases and software projects, the need for efficient source code search engines has increased dramatically. Unfortunately, existing information retrieval-based methods fail…

Software Engineering · Computer Science · 2021-09-27 · Muhammad Khalifa

Translation between natural language and source code can help software development by enabling developers to comprehend, ideate, search, and write computer programs in natural language. Despite growing interest from the industry and the…

Code search is a core software engineering task. Effective code search tools can help developers substantially improve their software development efficiency and effectiveness. In recent years, many code search studies have leveraged…

Software Engineering · Computer Science · 2021-10-12 · Chao Liu, Xin Xia, David Lo, Cuiyun Gao, Xiaohu Yang, John Grundy

The immense amounts of source code provide ample challenges and opportunities during software development. To handle the size of code bases, developers commonly search for code, e.g., when trying to find where a particular feature is…

Software Engineering · Computer Science · 2022-10-06 · Luca Di Grazia, Michael Pradel

Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. Currently, there is…

Software Engineering · Computer Science · 2024-01-02 · Yao Wan, Yang He, Zhangqian Bi, Jianguo Zhang, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin, Philip S. Yu

Semantic code search is the task of retrieving relevant code snippet given a natural language query. Different from typical information retrieval tasks, code search requires to bridge the semantic gap between the programming language and…

Computation and Language · Computer Science · 2022-01-28 · Chen Wu, Ming Yan

The ability to match pieces of code to their corresponding natural language descriptions and vice versa is fundamental for natural language search interfaces to software repositories. In this paper, we propose a novel multi-perspective…

Software Engineering · Computer Science · 2024-04-12 · Rajarshi Haldar, Lingfei Wu, Jinjun Xiong, Julia Hockenmaier

While scaling training compute has led to remarkable improvements in large language models (LLMs), scaling inference compute has not yet yielded analogous gains. We hypothesize that a core missing component is a lack of diverse LLM outputs,…

Machine Learning · Computer Science · 2024-10-22 · Evan Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song, Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, Hugh Zhang

Source Code Summarization is the task of writing short, natural language descriptions of source code. The main use for these descriptions is in software documentation e.g. the one-sentence Java method descriptions in JavaDocs. Code…

Computation and Language · Computer Science · 2019-04-05 · Alexander LeClair, Collin McMillan

Reusing existing datasets is of considerable significance to researchers and developers. Dataset search engines help a user find relevant datasets for reuse. They can present a snippet for each retrieved dataset to explain its relevance to…

Information Retrieval · Computer Science · 2019-07-03 · Xiaxia Wang, Jinchi Chen, Shuxin Li, Gong Cheng, Jeff Z. Pan, Evgeny Kharlamov, Yuzhong Qu