Related papers: Constructing Multilingual Code Search Dataset Usin…

200 papers

There has been an increase of interest in code search using natural language. Assessing the performance of such code search models can be difficult without a readily available evaluation suite. In this paper, we present an evaluation…

Software Engineering · Computer Science 2019-10-03 Hongyu Li, Seohyun Kim, Satish Chandra

Code translation aims to convert code from one programming language to another automatically. It is motivated by the need for multi-language software development and legacy system migration. In recent years, neural code translation has…

Software Engineering · Computer Science 2025-05-13 Xiang Chen, Jiacheng Xue, Xiaofei Xie, Caokai Liang, Xiaolin Ju

The performance of neural code search is significantly influenced by the quality of the training data from which the neural models are derived. A large corpus of high-quality query and code pairs is demanded to establish a precise mapping…

Software Engineering · Computer Science 2022-02-15 Zhensu Sun, Li Li, Yan Liu, Xiaoning Du, Li Li

Neural machine translation (NMT) methods developed for natural language processing have been shown to be highly successful in automating translation from one natural language to another. Recently, these NMT methods have been adapted to the…

Computation and Language · Computer Science 2023-05-24 Dharma KC, Clayton T. Morrison

This paper presents a high-quality multilingual dataset for the documentation domain to advance research on localization of structured text. Unlike widely-used datasets for translation of plain text, we collect XML-structured parallel text…

Computation and Language · Computer Science 2020-06-25 Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong

Code search is vital in the maintenance and extension of software systems. Past works have used separate language models for the natural language and programming language artifacts on models with multiple encoders and different loss…

Software Engineering · Computer Science 2024-10-31 Monoshiz Mahbub Khan, Zhe Yu

The Transformer architecture and transfer learning have marked a quantum leap in natural language processing, improving the state of the art across a range of text-based tasks. This paper examines how these advancements can be applied to…

Software Engineering · Computer Science 2022-08-29 Pasquale Salza, Christoph Schwizer, Jian Gu, Harald C. Gall

Consider the case where a programmer has written some part of a program, but has left part of the program (such as a method or a function body) incomplete. The goal is to use the context surrounding the missing code to automatically 'figure…

Software Engineering · Computer Science 2020-07-28 Rohan Mukherjee, Swarat Chaudhuri, Chris Jermaine

Millions of repetitive code snippets are submitted to code repositories every day. To search from these large codebases using simple natural language queries would allow programmers to ideate, prototype, and develop easier and faster.…

The ability to match pieces of code to their corresponding natural language descriptions and vice versa is fundamental for natural language search interfaces to software repositories. In this paper, we propose a novel multi-perspective…

Software Engineering · Computer Science 2024-04-12 Rajarshi Haldar, Lingfei Wu, Jinjun Xiong, Julia Hockenmaier

Translating source code from one programming language to another is a critical, time-consuming task in modernizing legacy applications and codebases. Recent work in this space has drawn inspiration from the software naturalness hypothesis…

Code search is a widely used technique by developers during software development. It provides semantically similar implementations from a large code corpus to developers based on their queries. Existing techniques leverage deep learning…

Software Engineering · Computer Science 2022-02-17 Weisong Sun, Chunrong Fang, Yuchen Chen, Guanhong Tao, Tingxu Han, Quanjun Zhang

Translation between natural language and source code can help software development by enabling developers to comprehend, ideate, search, and write computer programs in natural language. Despite growing interest from the industry and the…

We present The Vault, a dataset of high-quality code-text pairs in multiple programming languages for training large language models to understand and generate code. We present methods for thoroughly extracting samples that use both…

Computation and Language · Computer Science 2023-10-31 Dung Nguyen Manh, Nam Le Hai, Anh T. V. Dau, Anh Minh Nguyen, Khanh Nghiem, Jin Guo, Nghi D. Q. Bui

While there has been a recent burgeoning of applications at the intersection of natural and programming languages, such as code generation and code summarization, these applications are usually English-centric. This creates a barrier for…

Computation and Language · Computer Science 2023-02-08 Zhiruo Wang, Grace Cuenca, Shuyan Zhou, Frank F. Xu, Graham Neubig

Large language models are becoming increasingly practical for translating code across programming languages, a process known as transpiling. Even though automated transpilation significantly boosts developer productivity, a key concern is…

Software Engineering · Computer Science 2024-01-31 Hasan Ferit Eniser, Valentin Wüstholz, Maria Christakis

Reimplementing solutions to previously solved software engineering problems is not only inefficient but also introduces inadequate and error-prone code. Many existing methods achieve impressive performance on this issue by using…

Software Engineering · Computer Science 2022-10-04 Usama Nadeem, Noah Ziems, Shaoen Wu

Semantic code search is about finding semantically relevant code snippets for a given natural language query. In the state-of-the-art approaches, the semantic similarity between code and query is quantified as the distance of their…

Software Engineering · Computer Science 2022-01-14 Jian Gu, Zimin Chen, Martin Monperrus

Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code…

Recently, the automated translation of source code from one programming language to another by using automatic approaches inspired by Neural Machine Translation (NMT) methods for natural languages has come under study. However, such…
