Related papers: Structural Code Search using Natural Language Quer…

Frameworks for Querying Databases Using Natural Language: A Literature Review

A Natural Language Interface (NLI) facilitates users to pose queries to retrieve information from a database without using any artificial language such as the Structured Query Language (SQL). Several applications in various domains…

Databases · Computer Science 2019-09-05 Hafsa Shareef Dar, M. Ikramullah Lali, Moin Ul Din, Khalid Mahmood Malik, Syed Ahmad Chan Bukhari

Querying Source Code with Natural Language

One common task of developing or maintaining software is searching the source code for information like specific method calls or write accesses to certain fields. This kind of information is required to correctly implement new features and…

Software Engineering · Computer Science 2016-11-18 Markus Kimmig, Martin Monperrus, Mira Mezini

Query Understanding for Natural Language Enterprise Search

Natural Language Search (NLS) extends the capabilities of search engines that perform keyword search allowing users to issue queries in a more "natural" language. The engine tries to understand the meaning of the queries and to map the…

Machine Learning · Computer Science 2020-12-14 Francisco Borges, Georgios Balikas, Marc Brette, Guillaume Kempf, Arvind Srikantan, Matthieu Landos, Darya Brazouskaya, Qianqian Shi

Program Synthesis using Natural Language

Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require creation of small, often one-off, programs. End-users struggle with learning and using the myriad of domain-specific…

Programming Languages · Computer Science 2015-09-02 Aditya Desai, Sumit Gulwani, Vineet Hingorani, Nidhi Jain, Amey Karkare, Mark Marron, Sailesh R, Subhajit Roy

DocCGen: Document-based Controlled Code Generation

Recent developments show that Large Language Models (LLMs) produce state-of-the-art performance on natural language (NL) to code generation for resource-rich general-purpose languages like C++, Java, and Python. However, their practical…

Software Engineering · Computer Science 2024-07-04 Sameer Pimparkhede, Mehant Kammakomati, Srikanth Tamilselvam, Prince Kumar, Ashok Pon Kumar, Pushpak Bhattacharyya

Line-level Semantic Structure Learning for Code Vulnerability Detection

Unlike the flow structure of natural languages, programming languages have an inherent rigidity in structure and grammar. However, existing detection methods based on pre-trained models typically treat code as a natural language sequence,…

Software Engineering · Computer Science 2024-11-11 Ziliang Wang, Ge Li, Jia Li, Yihong Dong, Yingfei Xiong, Zhi Jin

Semantic Source Code Search: A Study of the Past and a Glimpse at the Future

With the recent explosion in the size and complexity of source codebases and software projects, the need for efficient source code search engines has increased dramatically. Unfortunately, existing information retrieval-based methods fail…

Software Engineering · Computer Science 2021-09-27 Muhammad Khalifa

Neural Program Search: Solving Programming Tasks from Description and Examples

We present a Neural Program Search, an algorithm to generate programs from natural language description and a small number of input/output examples. The algorithm combines methods from Deep Learning and Program Synthesis fields by designing…

Artificial Intelligence · Computer Science 2018-02-14 Illia Polosukhin, Alexander Skidanov

Structure-informed Language Models Are Protein Designers

This paper demonstrates that language models are strong structure-based protein designers. We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs), that have learned massive sequential…

Machine Learning · Computer Science 2023-02-10 Zaixiang Zheng, Yifan Deng, Dongyu Xue, Yi Zhou, Fei YE, Quanquan Gu

LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding

Recent progress in Large Language Models (LLMs) has opened new avenues for solving complex optimization problems, including Neural Architecture Search (NAS). However, existing LLM-driven NAS approaches rely heavily on prompt engineering and…

Computation and Language · Computer Science 2025-09-26 Yuxuan Hu, Jihao Liu, Ke Wang, Jinliang Zhen, Weikang Shi, Manyuan Zhang, Qi Dou, Rui Liu, Aojun Zhou, Hongsheng Li

CRaDLe: Deep Code Retrieval Based on Semantic Dependency Learning

Code retrieval is a common practice for programmers to reuse existing code snippets in open-source repositories. Given a user query (i.e., a natural language description), code retrieval aims at searching for the most relevant ones from a…

Software Engineering · Computer Science 2022-03-30 Wenchao Gu, Zongjie Li, Cuiyun Gao, Chaozheng Wang, Hongyu Zhang, Zenglin Xu, Michael R. Lyu

Pragmatic approach to structured data querying via natural language interface

As the use of technology increases and data analysis becomes integral in many businesses, the ability to quickly access and interpret data has become more important than ever. Information retrieval technologies are being utilized by…

Computation and Language · Computer Science 2018-07-03 Aliaksei Vertsel, Mikhail Rumiantsau

Structural Language Models of Code

We address the problem of any-code completion - generating a missing piece of source code in a given program without any restriction on the vocabulary or structure. We introduce a new approach to any-code completion that leverages the…

Machine Learning · Computer Science 2020-07-30 Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

Semantic Parsing for Complex Data Retrieval: Targeting Query Plans vs. SQL for No-Code Access to Relational Databases

Large Language Models (LLMs) have spurred progress in text-to-SQL, the task of generating SQL queries from natural language questions based on a given database schema. Despite the declarative nature of SQL, it continues to be a complex…

Computation and Language · Computer Science 2023-12-25 Ben Eyal, Amir Bachar, Ophir Haroche, Michael Elhadad

PSCS: A Path-based Neural Model for Semantic Code Search

To obtain code snippets for reuse, programmers prefer to search for related documents, e.g., blogs or Q&A, instead of code itself. The major reason is due to the semantic diversity and mismatch between queries and code snippets. Deep…

Software Engineering · Computer Science 2020-08-18 Zhensu Sun, Yan Liu, Chen Yang, Yu Qian

Exploring Code Analysis: Zero-Shot Insights on Syntax and Semantics with LLMs

Code analysis is fundamental in Software Engineering, supporting debugging, optimization, and security assessment. Human developers approach it through syntax parsing, static semantics inference, and dynamic reasoning. Traditional tools are…

Software Engineering · Computer Science 2026-05-22 Wei Ma, Zhihao Lin, Shangqing Liu, Qiang Hu, Ye Liu, Wenhan Wang, Cen Zhang, Liming Nie, Li Li, Yang Liu, Lingxiao Jiang

CoNCRA: A Convolutional Neural Network Code Retrieval Approach

Software developers routinely search for code using general-purpose search engines. However, these search engines cannot find code semantically unless it has an accompanying description. We propose a technique for semantic code search: A…

Machine Learning · Computer Science 2024-01-24 Marcelo de Rezende Martins, Marco A. Gerosa

Planning In Natural Language Improves LLM Search For Code Generation

While scaling training compute has led to remarkable improvements in large language models (LLMs), scaling inference compute has not yet yielded analogous gains. We hypothesize that a core missing component is a lack of diverse LLM outputs,…

Machine Learning · Computer Science 2024-10-22 Evan Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song, Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, Hugh Zhang

Structure-Grounded Knowledge Retrieval via Code Dependencies for Multi-Step Data Reasoning

Selecting the right knowledge is critical when using large language models (LLMs) to solve domain-specific data analysis tasks. However, most retrieval-augmented approaches rely primarily on lexical or embedding similarity, which is often a…

Computation and Language · Computer Science 2026-04-28 Xinyi Huang

A Survey on LLM-based Code Generation for Low-Resource and Domain-Specific Programming Languages

Large Language Models (LLMs) have shown impressive capabilities in code generation for popular programming languages. However, their performance on Low-Resource Programming Languages (LRPLs) and Domain-Specific Languages (DSLs) remains a…

Software Engineering · Computer Science 2025-09-29 Sathvik Joel, Jie JW Wu, Fatemeh H. Fard