Related papers: Universal Representation for Code

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

Graph representation learning has emerged as a powerful technique for addressing real-world problems. Various downstream graph learning tasks have benefited from its recent developments, such as node classification, similarity search, and…

Machine Learning · Computer Science 2020-07-03 Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, Jie Tang

Robust Graph Representation Learning via Predictive Coding

Predictive coding is a message-passing framework initially developed to model information processing in the brain, and now also topic of research in machine learning due to some interesting properties. One of such properties is the natural…

Machine Learning · Computer Science 2022-12-12 Billy Byiringiro, Tommaso Salvatori, Thomas Lukasiewicz

Accelerating Materials Discovery: Learning a Universal Representation of Chemical Processes for Cross-Domain Property Prediction

Experimental validation of chemical processes is slow and costly, limiting exploration in materials discovery. Machine learning can prioritize promising candidates, but existing data in patents and literature is heterogeneous and difficult…

Chemical Physics · Physics 2025-12-09 Mikhail Tsitsvero, Atsuyuki Nakao, Hisaki Ikebata

Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning

The progress made in code modeling has been tremendous in recent years thanks to the design of natural language processing learning approaches based on state-of-the-art model architectures. Nevertheless, we believe that the current…

Software Engineering · Computer Science 2022-02-22 Martin Weyssow, Houari Sahraoui, Bang Liu

Learning Program Semantics with Code Representations: An Empirical Study

Program semantics learning is the core and fundamental for various code intelligent tasks e.g., vulnerability detection, clone detection. A considerable amount of existing works propose diverse approaches to learn the program semantics for…

Software Engineering · Computer Science 2022-03-23 Jing Kai Siow, Shangqing Liu, Xiaofei Xie, Guozhu Meng, Yang Liu

Pre-Training Graph Neural Networks for Generic Structural Feature Extraction

Graph neural networks (GNNs) are shown to be successful in modeling applications with graph structures. However, training an accurate GNN model requires a large collection of labeled data and expressive features, which might be inaccessible…

Machine Learning · Computer Science 2019-06-03 Ziniu Hu, Changjun Fan, Ting Chen, Kai-Wei Chang, Yizhou Sun

A General Path-Based Representation for Predicting Program Properties

Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is $\textit{how to…

Programming Languages · Computer Science 2018-04-24 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

Commit2Vec: Learning Distributed Representations of Code Changes

Deep learning methods, which have found successful applications in fields like image classification and natural language processing, have recently been applied to source code analysis too, due to the enormous amount of freely available…

Software Engineering · Computer Science 2021-11-18 Rocío Cabrera Lozoya, Arnaud Baumann, Antonino Sabetta, Michele Bezzi

Learning to Represent Programs with Graphs

Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For…

Machine Learning · Computer Science 2018-05-08 Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

A Comprehensive Analytical Survey on Unsupervised and Semi-Supervised Graph Representation Learning Methods

Graph representation learning is a fast-growing field where one of the main objectives is to generate meaningful representations of graphs in lower-dimensional spaces. The learned embeddings have been successfully applied to perform various…

Machine Learning · Computer Science 2021-12-21 Md. Khaledur Rahman, Ariful Azad

Self-Supervised Graph Representation Learning via Global Context Prediction

To take full advantage of fast-growing unlabeled networked data, this paper introduces a novel self-supervised strategy for graph representation learning by exploiting natural supervision provided by the data itself. Inspired by human…

Machine Learning · Computer Science 2025-11-20 Zhen Peng, Yixiang Dong, Minnan Luo, Xiao-Ming Wu, Qinghua Zheng

Graph Representation Ensemble Learning

Representation learning on graphs has been gaining attention due to its wide applicability in predicting missing links, and classifying and recommending nodes. Most embedding methods aim to preserve certain properties of the original graph…

Social and Information Networks · Computer Science 2019-09-13 Palash Goyal, Di Huang, Sujit Rokka Chhetri, Arquimedes Canedo, Jaya Shree, Evan Patterson

Code Representation Learning At Scale

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science 2024-02-06 Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

GraphCodeBERT: Pre-training Code Representations with Data Flow

Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code…

Software Engineering · Computer Science 2021-09-14 Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou

Unsupervised Learning of General-Purpose Embeddings for Code Changes

Applying machine learning to tasks that operate with code changes requires their numerical representation. In this work, we propose an approach for obtaining such representations during pre-training and evaluate them on two different…

Software Engineering · Computer Science 2021-07-12 Mikhail Pravilov, Egor Bogomolov, Yaroslav Golubev, Timofey Bryksin

Learning More Universal Representations for Transfer-Learning

A representation is supposed universal if it encodes any element of the visual world (e.g., objects, scenes) in any configuration (e.g., scale, context). While not expecting pure universal representations, the goal in the literature is to…

Computer Vision and Pattern Recognition · Computer Science 2018-09-05 Youssef Tamaazousti, Hervé Le Borgne, Céline Hudelot, Mohamed El Amine Seddik, Mohamed Tamaazousti

Strategies for Pre-training Graph Neural Networks

Many applications of machine learning require a model to make accurate predictions on test examples that are distributionally different from training ones, while task-specific labels are scarce during training. An effective approach to…

Machine Learning · Computer Science 2020-02-20 Weihua Hu, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay Pande, Jure Leskovec

Investigating Representation Universality: Case Study on Genealogical Representations

Motivated by interpretability and reliability, we investigate whether large language models (LLMs) deploy universal geometric structures to encode discrete, graph-structured knowledge. To this end, we present two complementary experimental…

Machine Learning · Computer Science 2025-11-25 David D. Baek, Yuxiao Li, Max Tegmark

Learning to Make Predictions on Graphs with Autoencoders

We examine two fundamental tasks associated with graph representation learning: link prediction and semi-supervised node classification. We present a novel autoencoder architecture capable of learning a joint representation of both local…

Machine Learning · Computer Science 2019-03-12 Phi Vu Tran

Always be Pre-Training: Representation Learning for Network Intrusion Detection with GNNs

Graph neural network-based network intrusion detection systems have recently demonstrated state-of-the-art performance on benchmark datasets. Nevertheless, these methods suffer from a reliance on target encoding for data pre-processing,…

Cryptography and Security · Computer Science 2024-03-01 Zhengyao Gu, Diego Troy Lopez, Lilas Alrahis, Ozgur Sinanoglu