Related papers: OCoR: An Overlapping-Aware Code Retriever

200 papers

To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code…

Software Engineering · Computer Science · 2019-04-02 · Ziyu Yao, Jayavardhan Reddy Peddamail, Huan Sun

Despite the substantial success of Information Retrieval (IR) in various NLP tasks, most IR systems predominantly handle queries and corpora in natural language, neglecting the domain of code retrieval. Code retrieval is critically…

Information Retrieval · Computer Science · 2025-06-09 · Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Hao Zhang, Xinyi Dai, Yasheng Wang, Ruiming Tang

Developers often depend on code search engines to obtain solutions for their programming tasks. However, finding an expected solution containing code examples along with their explanations is challenging due to several issues. There is a…

Software Engineering · Computer Science · 2021-08-06 · Rodrigo F. Silva, M. Masudur Rahman, Carlos Eduardo Dantas, Chanchal Roy, Foutse Khomh, Marcelo A. Maia

Code search, framed as information retrieval (IR), underpins modern software engineering and increasingly powers retrieval-augmented generation (RAG), improving code discovery, reuse, and the reliability of LLM-based coding. Yet existing…

Software Engineering · Computer Science · 2026-04-20 · Jiahui Geng, Qing Li, Fengyu Cai, Fakhri Karray

Software developers routinely search for code using general-purpose search engines. However, these search engines cannot find code semantically unless it has an accompanying description. We propose a technique for semantic code search: A…

Machine Learning · Computer Science · 2024-01-24 · Marcelo de Rezende Martins, Marco A. Gerosa

Code retrieval allows software engineers to search code through a natural language query, which relies on both natural language processing and software engineering techniques. There have been several attempts at code retrieval from…

Software Engineering · Computer Science · 2021-10-19 · Mehdi Bahrami, N. C. Shrikanth, Yuji Mizobuchi, Lei Liu, Masahiro Fukuyori, Wei-Peng Chen, Kazuki Munakata

Code generation aims to automatically generate code snippets in a specific programming language from natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science · 2025-01-24 · Zezhou Yang, Sirong Chen, Cuiyun Gao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia

Code retrieval is a common practice for programmers to reuse existing code snippets in open-source repositories. Given a user query (i.e., a natural language description), code retrieval aims at searching for the most relevant ones from a…

Software Engineering · Computer Science · 2022-03-30 · Wenchao Gu, Zongjie Li, Cuiyun Gao, Chaozheng Wang, Hongyu Zhang, Zenglin Xu, Michael R. Lyu

This paper proposes OCR++, an open-source framework designed for a variety of information extraction tasks from scholarly articles including metadata (title, author names, affiliation and e-mail), structure (section headings and body text,…

Using large language models to generate code has shown promise for transforming software development. Despite the intelligence shown by the large language models, their specificity in code generation can still be improved due to…

Software Engineering · Computer Science · 2025-05-20 · Kounianhua Du, Jizheng Chen, Renting Rui, Huacan Chai, Lingyue Fu, Wei Xia, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

Code search is a widely used technique by developers during software development. It provides semantically similar implementations from a large code corpus to developers based on their queries. Existing techniques leverage deep learning…

Software Engineering · Computer Science · 2022-02-17 · Weisong Sun, Chunrong Fang, Yuchen Chen, Guanhong Tao, Tingxu Han, Quanjun Zhang

Code search is vital in the maintenance and extension of software systems. Past works have used separate language models for the natural language and programming language artifacts on models with multiple encoders and different loss…

Software Engineering · Computer Science · 2024-10-31 · Monoshiz Mahbub Khan, Zhe Yu

Code embeddings capture the semantic representations of code and are crucial for various code-related large language model (LLM) applications, such as code search. Previous training primarily relies on optimizing the InfoNCE loss by…

Computation and Language · Computer Science · 2025-07-18 · Zuchen Gao, Zizheng Zhan, Xianming Li, Erxin Yu, Ziqi Zhan, Haotian Zhang, Bin Chen, Yuqun Zhang, Jing Li

Optical character recognition (OCR) is a widely used pattern recognition application in numerous domains. There are several feature-rich, general-purpose OCR solutions available for consumers, which can provide moderate to excellent…

Computer Vision and Pattern Recognition · Computer Science · 2021-05-18 · Ayantha Randika, Nilanjan Ray, Xiao Xiao, Allegra Latimer

Code summarization generates a brief natural language description given a source code snippet, while code retrieval fetches relevant source code given a natural language query. Since both tasks aim to model the association between natural…

Information Retrieval · Computer Science · 2020-02-26 · Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoyin Wang, Shikun Zhang

Despite Retrieval-Augmented Generation improving code completion, traditional retrieval methods struggle with information redundancy and a lack of diversity within limited context windows. To solve this, we propose a resource-optimized…

Software Engineering · Computer Science · 2025-10-14 · Xiaohan Chen, Zhongying Pan, Quan Feng, Yu Tian, Shuqun Yang, Mengru Wang, Lina Gong, Yuxia Geng, Piji Li, Xiang Chen

Optical Character Recognition (OCR) technology finds applications in digitizing books and unstructured documents, along with applications in other domains such as mobility statistics, law enforcement, traffic, security systems, etc. The…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-11 · Aishik Rakshit, Samyak Mehta, Anirban Dasgupta

Recently, retrieval-augmented generation (RAG) has been successfully applied in code generation. However, existing pipelines for retrieval-augmented code generation (RACG) employ static knowledge bases with a single source, limiting the…

Computation and Language · Computer Science · 2024-12-04 · Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu

Pretrained language models have shown strong effectiveness in code-related tasks, such as code retrieval, code generation, code summarization, and code completion tasks. In this paper, we propose COde assistaNt viA retrieval-augmeNted…

Computation and Language · Computer Science · 2024-11-05 · Xinze Li, Hanbin Wang, Zhenghao Liu, Shi Yu, Shuo Wang, Yukun Yan, Yukai Fu, Yu Gu, Ge Yu

Semantic code search is the task of retrieving a relevant code snippet given a natural language query. Unlike typical information retrieval tasks, code search requires bridging the semantic gap between the programming language and…

Computation and Language · Computer Science · 2022-01-28 · Chen Wu, Ming Yan