Related papers: Contrastive Code Representation Learning

200 papers

Large-scale pre-trained models such as CodeBERT, GraphCodeBERT have earned widespread attention from both academia and industry. Attributed to the superior ability in code representation, they have been further applied in multiple…

Software Engineering · Computer Science 2023-01-24 Shangqing Liu, Bozhi Wu, Xiaofei Xie, Guozhu Meng, Yang Liu

Deep Learning (DL) models to analyze source code have shown immense promise during the past few years. More recently, self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE…

Software Engineering · Computer Science 2023-06-07 Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray

We propose Corder, a self-supervised contrastive learning framework for source code model. Corder is designed to alleviate the need of labeled data for code retrieval and code summarization tasks. The pre-trained model of Corder can be used…

Software Engineering · Computer Science 2021-05-25 Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. Recently, many pre-trained language models for…

Computation and Language · Computer Science 2021-09-10 Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

Recent years have witnessed increasing interest in code representation learning, which aims to represent the semantics of source code into distributed vectors. Currently, various works have been proposed to represent the complex semantics…

Programming Languages · Computer Science 2022-05-05 Xin Wang, Yasheng Wang, Yao Wan, Jiawei Wang, Pingyi Zhou, Li Li, Hao Wu, Jin Liu

Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both…

Pretrained language models such as BERT, GPT have shown great effectiveness in language understanding. The auxiliary predictive tasks in existing pretraining approaches are mostly defined on tokens, thus may not be able to capture…

Computation and Language · Computer Science 2020-06-19 Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie

Recently, contrastive learning attracts increasing interests in neural text generation as a new solution to alleviate the exposure bias problem. It introduces a sequence-level training signal which is crucial to generation tasks that always…

Computation and Language · Computer Science 2023-02-06 Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing. These models typically corrupt the given sequences with certain types of noise,…

Computation and Language · Computer Science 2020-11-02 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu Sun

Binary code analysis and comprehension is critical to applications in reverse engineering and computer security tasks where source code is not available. Unfortunately, unlike source code, binary code lacks semantics and is more difficult…

Software Engineering · Computer Science 2025-09-29 Yifan Zhang, Chen Huang, Yueke Zhang, Huajie Shao, Kevin Leach, Yu Huang

With the great success of pre-trained models, the pretrain-then-finetune paradigm has been widely adopted on downstream tasks for source code understanding. However, compared to costly training a large-scale model from scratch, how to…

Software Engineering · Computer Science 2022-03-16 Deze Wang, Zhouyang Jia, Shanshan Li, Yue Yu, Yun Xiong, Wei Dong, Xiangke Liao

Code contrastive pre-training has recently achieved significant progress on code-related tasks. In this paper, we present SCodeR, a Soft-labeled contrastive pre-training framework with two positive sample construction…

Computation and Language · Computer Science 2022-10-27 Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science 2024-02-06 Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

Text embeddings are useful features in many applications such as semantic search and computing text similarity. Previous work typically trains models customized for different use cases, varying in dataset choice, training objective and…

Deep learning methods, which have found successful applications in fields like image classification and natural language processing, have recently been applied to source code analysis too, due to the enormous amount of freely available…

Software Engineering · Computer Science 2021-11-18 Rocìo Cabrera Lozoya, Arnaud Baumann, Antonino Sabetta, Michele Bezzi

Contrastive learning has moved the state of the art for many tasks in computer vision and information retrieval in recent years. This poster is the first work that applies supervised contrastive learning to the task of product matching in…

Machine Learning · Computer Science 2022-05-03 Ralph Peeters, Christian Bizer

Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection. However, current methods are still primarily applied to curated datasets like ImageNet. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-12-15 Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Luc Van Gool

Pre-trained models of source code have recently been successfully applied to a wide variety of Software Engineering tasks; they have also seen some practical adoption in practice, e.g. for code completion. Yet, we still know very little…

Software Engineering · Computer Science 2023-12-11 Anjan Karmakar, Romain Robbes

Contrastive learning is a powerful technique to learn representations that are semantically distinctive and geometrically invariant. While most of the earlier approaches have demonstrated its effectiveness on single-modality learning tasks…

Computer Vision and Pattern Recognition · Computer Science 2021-10-19 Anurag Jain, Yashaswi Verma

In this paper, we propose the CodeRetriever model, which learns the function-level code semantic representations through large-scale code-text contrastive pre-training. We adopt two contrastive learning schemes in CodeRetriever: unimodal…

Computation and Language · Computer Science 2022-10-27 Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan