Related papers: CODE-MVP: Learning to Represent Source Code from M…

200 papers

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for…

Computation and Language · Computer Science 2021-09-10 Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

Vulnerability identification is crucial for cyber security in the software-related industry. Early identification methods require significant manual efforts in crafting features or annotating vulnerable code. Although the recent pre-trained…

Software Engineering · Computer Science 2022-08-11 Xuxiang Jiang, Yinhao Xiao, Jun Wang, Wei Zhang

Recent work learns contextual representations of source code by reconstructing tokens from their context. For downstream semantic understanding tasks like summarizing code in English, these representations should ideally capture program…

Machine Learning · Computer Science 2022-01-10 Paras Jain, Ajay Jain, Tianjun Zhang, Pieter Abbeel, Joseph E. Gonzalez, Ion Stoica

Multi-view representation learning aims to capture comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning to different views in a pairwise manner, which is still scalable:…

Computer Vision and Pattern Recognition · Computer Science 2023-08-24 Jiangmeng Li, Hang Gao, Wenwen Qiang, Changwen Zheng

We propose Corder, a self-supervised contrastive learning framework for source code model. Corder is designed to alleviate the need of labeled data for code retrieval and code summarization tasks. The pre-trained model of Corder can be used…

Software Engineering · Computer Science 2021-05-25 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

Deep Learning (DL) models to analyze source code have shown immense promise during the past few years. More recently, self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE…

Software Engineering · Computer Science 2023-06-07 Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray

Multi-view representation learning captures comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning (CL) to learn representations, regarded as a pairwise manner, which is still…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Jiangmeng Li, Wenwen Qiang, Hang Gao, Bing Su, Farid Razzak, Jie Hu, Changwen Zheng, Hui Xiong

Humans view the world through many sensory channels, e.g., the long-wavelength light channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right ear. Each view is noisy and incomplete, but important…

Computer Vision and Pattern Recognition · Computer Science 2020-12-21 Yonglong Tian, Dilip Krishnan, Phillip Isola

In this paper, we propose the CodeRetriever model, which learns the function-level code semantic representations through large-scale code-text contrastive pre-training. We adopt two contrastive learning schemes in CodeRetriever: unimodal…

Computation and Language · Computer Science 2022-10-27 Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy…

Computer Vision and Pattern Recognition · Computer Science 2021-04-28 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

Although Large Language Models (LLMs) excel in reasoning and generation for language tasks, they are not specifically designed for multimodal challenges. Training Multimodal Large Language Models (MLLMs), however, is resource-intensive and…

Computer Vision and Pattern Recognition · Computer Science 2025-02-18 Yuqi Pang, Bowen Yang, Haoqin Tu, Yun Cao, Zeyu Zhang

Recent years have witnessed a significant increase in the performance of Vision and Language tasks. Foundational Vision-Language Models (VLMs), such as CLIP, have been leveraged in multiple settings and demonstrated remarkable performance…

Computer Vision and Pattern Recognition · Computer Science 2024-03-04 Santiago Castro, Amir Ziai, Avneesh Saluja, Zhuoning Yuan, Rada Mihalcea

Contrastive learning has moved the state of the art for many tasks in computer vision and information retrieval in recent years. This poster is the first work that applies supervised contrastive learning to the task of product matching in…

Machine Learning · Computer Science 2022-05-03 Ralph Peeters, Christian Bizer

Source code (Context) and its parsed abstract syntax tree (AST; Structure) are two complementary representations of the same computer program. Traditionally, designers of machine learning models have relied predominantly either on Structure…

Machine Learning · Computer Science 2021-03-23 Daniel Zügner, Tobias Kirschstein, Michele Catasta, Jure Leskovec, Stephan Günnemann

Program representation, which aims at converting program source code into vectors with automatically extracted features, is a fundamental problem in programming language processing (PLP). Recent work tries to represent programs with neural…

Machine Learning · Computer Science 2022-02-28 Ting Long, Yutong Xie, Xianyu Chen, Weinan Zhang, Qinxiang Cao, Yong Yu

Several multi-modality representation learning approaches such as LXMERT and ViLBERT have been proposed recently. Such approaches can achieve superior performance due to the high-level semantic information captured during large-scale…

Computer Vision and Pattern Recognition · Computer Science 2020-07-28 Lei Shi, Kai Shuang, Shijie Geng, Peng Su, Zhengkai Jiang, Peng Gao, Zuohui Fu, Gerard de Melo, Sen Su

Source code and its accompanying comments are complementary yet naturally aligned modalities: code encodes structural logic while comments capture developer intent. However, existing vulnerability detection methods mostly rely on…

Software Engineering · Computer Science 2026-05-01 Zeming Dong, Yuejun Guo, Qiang Hu, Yao Zhang, Maxime Cordy, Hao Liu, Mike Papadakis, Yongqiang Lyu

Large-scale multi-modal contrastive pre-training has demonstrated great utility to learn transferable features for a range of downstream tasks by mapping multiple modalities into a shared embedding space. Typically, this has employed…

Computer Vision and Pattern Recognition · Computer Science 2022-07-27 Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan

Recent Vision-Language Pre-trained (VLP) models based on dual encoder have attracted extensive attention from academia and industry due to their superior performance on various cross-modal tasks and high computational efficiency. They…

Computer Vision and Pattern Recognition · Computer Science 2022-10-03 Bin Shan, Weichong Yin, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Pioneering dual-encoder pre-training works (e.g., CLIP and ALIGN) have revealed the potential of aligning multi-modal representations with contrastive learning. However, these works require a tremendous amount of data and computational…

Computer Vision and Pattern Recognition · Computer Science 2022-07-19 Quan Cui, Boyan Zhou, Yu Guo, Weidong Yin, Hao Wu, Osamu Yoshie, Yubo Chen