Related papers: deGraphCS: Embedding Variable-based Flow Graph for…

Deep Graph Matching and Searching for Semantic Code Retrieval

Code retrieval aims to find, in a large corpus of source code repositories, the code snippet that best matches a natural language query. Recent work mainly uses natural language processing techniques to process both query…

Artificial Intelligence · Computer Science 2021-06-23 Xiang Ling, Lingfei Wu, Saizhuo Wang, Gaoning Pan, Tengfei Ma, Fangli Xu, Alex X. Liu, Chunming Wu, Shouling Ji

GraphSearchNet: Enhancing GNNs via Capturing Global Dependencies for Semantic Code Search

Code search aims to retrieve accurate code snippets based on a natural language query to improve software productivity and quality. With the massive amount of available programs (such as on GitHub or Stack Overflow), identifying and…

Software Engineering · Computer Science 2023-02-14 Shangqing Liu, Xiaofei Xie, Jingkai Siow, Lei Ma, Guozhu Meng, Yang Liu

Learning Deep Semantic Model for Code Search using CodeSearchNet Corpus

Semantic code search is the task of retrieving relevant code snippets given a natural language query. Unlike typical information retrieval tasks, code search requires bridging the semantic gap between the programming language and…

Computation and Language · Computer Science 2022-01-28 Chen Wu, Ming Yan

Code Search based on Context-aware Code Translation

Code search is a technique widely used by developers during software development. It provides semantically similar implementations from a large code corpus to developers based on their queries. Existing techniques leverage deep learning…

Software Engineering · Computer Science 2022-02-17 Weisong Sun, Chunrong Fang, Yuchen Chen, Guanhong Tao, Tingxu Han, Quanjun Zhang

PSCS: A Path-based Neural Model for Semantic Code Search

To obtain code snippets for reuse, programmers prefer to search for related documents, e.g., blogs or Q&A, instead of the code itself. The major reason is the semantic diversity and mismatch between queries and code snippets. Deep…

Software Engineering · Computer Science 2020-08-18 Zhensu Sun, Yan Liu, Chen Yang, Yu Qian

CONCORD: Towards a DSL for Configurable Graph Code Representation

Deep learning is widely used to uncover hidden patterns in large code corpora. To achieve this, constructing a format that captures the relevant characteristics and features of source code is essential. Graph-based representations have…

Software Engineering · Computer Science 2024-02-01 Mootez Saad, Tushar Sharma

DeepVS: An Efficient and Generic Approach for Source Code Modeling Usage

The source code suggestions provided by current IDEs are mostly dependent on static type learning. These suggestions often end up being irrelevant for a peculiar context. Recently, deep learning-based approaches have shown…

Neural and Evolutionary Computing · Computer Science 2020-07-15 Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang

TreeCaps: Tree-Based Capsule Networks for Source Code Processing

Recently, program learning techniques have been proposed to process source code based on syntactical structures (e.g., Abstract Syntax Trees) and/or semantic information (e.g., Dependency Graphs). Although graphs may be better at capturing…

Software Engineering · Computer Science 2020-12-15 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

Making Fast Graph-based Algorithms with Graph Metric Embeddings

The computation of distance measures between nodes in graphs is inefficient and does not scale to large graphs. We explore dense vector representations as an effective way to approximate the same information: we introduce a simple yet…

Computation and Language · Computer Science 2019-06-18 Andrey Kutuzov, Mohammad Dorgham, Oleksiy Oliynyk, Chris Biemann, Alexander Panchenko

GraphNAS: Graph Neural Architecture Search with Reinforcement Learning

Graph Neural Networks (GNNs) have been popularly used for analyzing non-Euclidean data such as social network data and biological data. Despite their success, the design of graph neural networks requires a lot of manual work and domain…

Machine Learning · Computer Science 2020-11-03 Yang Gao, Hong Yang, Peng Zhang, Chuan Zhou, Yue Hu

GypSum: Learning Hybrid Representations for Code Summarization

Code summarization with deep learning has been widely studied in recent years. Current deep learning models for code summarization generally follow the principle in neural machine translation and adopt the encoder-decoder framework, where…

Software Engineering · Computer Science 2022-04-28 Yu Wang, Yu Dong, Xuesong Lu, Aoying Zhou

Survey of Code Search Based on Deep Learning

Code writing is repetitive and predictable, inspiring us to develop various code intelligence techniques. This survey focuses on code search, that is, to retrieve code that matches a given query by effectively capturing the semantic…

Software Engineering · Computer Science 2023-12-14 Yutao Xie, Jiayi Lin, Hande Dong, Lei Zhang, Zhonghai Wu

A Comprehensive Survey on Deep Graph Representation Learning

Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied in a range of fields, including machine…

Machine Learning · Computer Science 2024-02-29 Wei Ju, Zheng Fang, Yiyang Gu, Zequn Liu, Qingqing Long, Ziyue Qiao, Yifang Qin, Jianhao Shen, Fang Sun, Zhiping Xiao, Junwei Yang, Jingyang Yuan, Yusheng Zhao, Yifan Wang, Xiao Luo, Ming Zhang

Deep Code Search with Naming-Agnostic Contrastive Multi-View Learning

Software development is a repetitive task, as developers usually reuse or get inspiration from existing implementations. Code search, which refers to the retrieval of relevant code snippets from a codebase according to the developer's…

Information Retrieval · Computer Science 2025-08-12 Jiadong Feng, Wei Li, Suhuang Wu, Zhao Wei, Yong Xu, Juhong Wang, Hui Li

Learning to Represent Programs with Graphs

Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For…

Machine Learning · Computer Science 2018-05-08 Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning

This paper considers the problem of resource allocation in stream processing, where continuous data flows must be processed in real time in a large distributed system. To maximize system throughput, the resource allocation strategy that…

Machine Learning · Computer Science 2019-11-21 Xiang Ni, Jing Li, Mo Yu, Wang Zhou, Kun-Lung Wu

Convolutional Neural Networks over Control Flow Graphs for Software Defect Prediction

Defects in software components are unavoidable and lead not only to wasted time and money but also to many serious consequences. To build predictive models, previous studies focus on manually extracting features or using tree…

Software Engineering · Computer Science 2018-02-15 Anh Viet Phan, Minh Le Nguyen, Lam Thu Bui

CodeGraph: Enhancing Graph Reasoning of LLMs with Code

With the increasing popularity of large language models (LLMs), reasoning on basic graph algorithm problems is an essential intermediate step in assessing their abilities to process and infer complex graph reasoning tasks. Existing methods…

Computation and Language · Computer Science 2024-08-27 Qiaolong Cai, Zhaowei Wang, Shizhe Diao, James Kwok, Yangqiu Song

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Sampling methods (e.g., node-wise, layer-wise, or subgraph) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on the graph structural…

Machine Learning · Computer Science 2021-09-07 Weilin Cong, Rana Forsati, Mahmut Kandemir, Mehrdad Mahdavi

GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization

As opposed to natural languages, source code understanding is influenced by grammatical relationships between tokens regardless of their identifier names. Graph representations of source code such as the Abstract Syntax Tree (AST) can capture…

Machine Learning · Computer Science 2021-11-18 Junyan Cheng, Iordanis Fostiropoulos, Barry Boehm