Related papers: CODE-MVP: Learning to Represent Source Code from M…

SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for…

Computation and Language · Computer Science 2021-09-10 Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

Multi-View Pre-Trained Model for Code Vulnerability Identification

Vulnerability identification is crucial for cyber security in the software-related industry. Early identification methods require significant manual efforts in crafting features or annotating vulnerable code. Although the recent pre-trained…

Software Engineering · Computer Science 2022-08-11 Xuxiang Jiang, Yinhao Xiao, Jun Wang, Wei Zhang

Contrastive Code Representation Learning

Recent work learns contextual representations of source code by reconstructing tokens from their context. For downstream semantic understanding tasks like summarizing code in English, these representations should ideally capture program…

Machine Learning · Computer Science 2022-01-10 Paras Jain, Ajay Jain, Tianjun Zhang, Pieter Abbeel, Joseph E. Gonzalez, Ion Stoica

Information Theory-Guided Heuristic Progressive Multi-View Coding

Multi-view representation learning aims to capture comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning to different views in a pairwise manner, which is still scalable:…

Computer Vision and Pattern Recognition · Computer Science 2023-08-24 Jiangmeng Li, Hang Gao, Wenwen Qiang, Changwen Zheng

Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations

We propose Corder, a self-supervised contrastive learning framework for source code models. Corder is designed to alleviate the need for labeled data for code retrieval and code summarization tasks. The pre-trained model of Corder can be used…

Software Engineering · Computer Science 2021-05-25 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

CONCORD: Clone-aware Contrastive Learning for Source Code

Deep Learning (DL) models to analyze source code have shown immense promise during the past few years. More recently, self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE…

Software Engineering · Computer Science 2023-06-07 Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray

Information Theory-Guided Heuristic Progressive Multi-View Coding

Multi-view representation learning captures comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning (CL) to learn representations, regarded as a pairwise manner, which is still…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Jiangmeng Li, Wenwen Qiang, Hang Gao, Bing Su, Farid Razzak, Jie Hu, Changwen Zheng, Hui Xiong

Contrastive Multiview Coding

Humans view the world through many sensory channels, e.g., the long-wavelength light channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right ear. Each view is noisy and incomplete, but important…

Computer Vision and Pattern Recognition · Computer Science 2020-12-21 Yonglong Tian, Dilip Krishnan, Phillip Isola

CodeRetriever: Unimodal and Bimodal Contrastive Learning for Code Search

In this paper, we propose the CodeRetriever model, which learns the function-level code semantic representations through large-scale code-text contrastive pre-training. We adopt two contrastive learning schemes in CodeRetriever: unimodal…

Computation and Language · Computer Science 2022-10-27 Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

Multimodal Contrastive Training for Visual Representation Learning

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy…

Computer Vision and Pattern Recognition · Computer Science 2021-04-28 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning

Although Large Language Models (LLMs) excel in reasoning and generation for language tasks, they are not specifically designed for multimodal challenges. Training Multimodal Large Language Models (MLLMs), however, is resource-intensive and…

Computer Vision and Pattern Recognition · Computer Science 2025-02-18 Yuqi Pang, Bowen Yang, Haoqin Tu, Yun Cao, Zeyu Zhang

CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models

Recent years have witnessed a significant increase in the performance of Vision and Language tasks. Foundational Vision-Language Models (VLMs), such as CLIP, have been leveraged in multiple settings and demonstrated remarkable performance…

Computer Vision and Pattern Recognition · Computer Science 2024-03-04 Santiago Castro, Amir Ziai, Avneesh Saluja, Zhuoning Yuan, Rada Mihalcea

Supervised Contrastive Learning for Product Matching

Contrastive learning has moved the state of the art for many tasks in computer vision and information retrieval in recent years. This poster is the first work that applies supervised contrastive learning to the task of product matching in…

Machine Learning · Computer Science 2022-05-03 Ralph Peeters, Christian Bizer

Language-Agnostic Representation Learning of Source Code from Structure and Context

Source code (Context) and its parsed abstract syntax tree (AST; Structure) are two complementary representations of the same computer program. Traditionally, designers of machine learning models have relied predominantly either on Structure…

Machine Learning · Computer Science 2021-03-23 Daniel Zügner, Tobias Kirschstein, Michele Catasta, Jure Leskovec, Stephan Günnemann

Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection

Program representation, which aims at converting program source code into vectors with automatically extracted features, is a fundamental problem in programming language processing (PLP). Recent work tries to represent programs with neural…

Machine Learning · Computer Science 2022-02-28 Ting Long, Yutong Xie, Xianyu Chen, Weinan Zhang, Qinxiang Cao, Yong Yu

Contrastive Visual-Linguistic Pretraining

Several multi-modality representation learning approaches such as LXMERT and ViLBERT have been proposed recently. Such approaches can achieve superior performance due to the high-level semantic information captured during large-scale…

Computer Vision and Pattern Recognition · Computer Science 2020-07-28 Lei Shi, Kai Shuang, Shijie Geng, Peng Su, Zhengkai Jiang, Peng Gao, Zuohui Fu, Gerard de Melo, Sen Su

Learning Generalizable Multimodal Representations for Software Vulnerability Detection

Source code and its accompanying comments are complementary yet naturally aligned modalities: code encodes structural logic while comments capture developer intent. However, existing vulnerability detection methods mostly rely on…

Software Engineering · Computer Science 2026-05-01 Zeming Dong, Yuejun Guo, Qiang Hu, Yao Zhang, Maxime Cordy, Hao Liu, Mike Papadakis, Yongqiang Lyu

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training

Large-scale multi-modal contrastive pre-training has demonstrated great utility to learn transferable features for a range of downstream tasks by mapping multiple modalities into a shared embedding space. Typically, this has employed…

Computer Vision and Pattern Recognition · Computer Science 2022-07-27 Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan

ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training

Recent Vision-Language Pre-trained (VLP) models based on dual encoder have attracted extensive attention from academia and industry due to their superior performance on various cross-modal tasks and high computational efficiency. They…

Computer Vision and Pattern Recognition · Computer Science 2022-10-03 Bin Shan, Weichong Yin, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Contrastive Vision-Language Pre-training with Limited Resources

Pioneering dual-encoder pre-training works (e.g., CLIP and ALIGN) have revealed the potential of aligning multi-modal representations with contrastive learning. However, these works require a tremendous amount of data and computational…

Computer Vision and Pattern Recognition · Computer Science 2022-07-19 Quan Cui, Boyan Zhou, Yu Guo, Weidong Yin, Hao Wu, Osamu Yoshie, Yubo Chen