Related papers: Exploring Representation-Level Augmentation for Co…

Unsupervised Document Embedding via Contrastive Augmentation

We present a contrasting learning approach with data augmentation techniques to learn document representations in an unsupervised manner. Inspired by recent contrastive self-supervised learning algorithms used for image and NLP pretraining,…

Computation and Language · Computer Science 2021-03-29 Dongsheng Luo, Wei Cheng, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Bo Zong, Yanchi Liu, Zhengzhang Chen, Dongjin Song, Haifeng Chen, Xiang Zhang

Generation-Augmented Query Expansion For Code Retrieval

Pre-trained language models have achieved promising success in code retrieval tasks, where a natural language documentation query is given to find the most relevant existing code snippet. However, existing models focus only on optimizing…

Software Engineering · Computer Science 2022-12-22 Dong Li, Yelong Shen, Ruoming Jin, Yi Mao, Kuan Wang, Weizhu Chen

AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models

Code retrieval is allowing software engineers to search codes through a natural language query, which relies on both natural language processing and software engineering techniques. There have been several attempts on code retrieval from…

Software Engineering · Computer Science 2021-10-19 Mehdi Bahrami, N. C. Shrikanth, Yuji Mizobuchi, Lei Liu, Masahiro Fukuyori, Wei-Peng Chen, Kazuki Munakata

CoCoSoDa: Effective Contrastive Learning for Code Search

Code search aims to retrieve semantically relevant code snippets for a given natural language query. Recently, many approaches employing contrastive learning have shown promising results on code representation learning and greatly improved…

Software Engineering · Computer Science 2023-02-14 Ensheng Shi, Yanlin Wang, Wenchao Gu, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

Code Representation Learning At Scale

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred…

Computation and Language · Computer Science 2024-02-06 Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang

Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views

A data augmentation module is utilized in contrastive learning to transform the given data example into two views, which is considered essential and irreplaceable. However, the predetermined composition of multiple data augmentations brings…

Computer Vision and Pattern Recognition · Computer Science 2022-08-23 Junbo Zhang, Kaisheng Ma

Boosting Source Code Learning with Text-Oriented Data Augmentation: An Empirical Study

Recent studies have demonstrated remarkable advancements in source code learning, which applies deep neural networks (DNNs) to tackle various software engineering tasks. Similar to other DNN-based domains, source code learning also requires…

Software Engineering · Computer Science 2025-02-07 Zeming Dong, Qiang Hu, Yuejun Guo, Zhenya Zhang, Maxime Cordy, Mike Papadakis, Yves Le Traon, Jianjun Zhao

Improving Contrastive Learning with Model Augmentation

The sequential recommendation aims at predicting the next items in user behaviors, which can be solved by characterizing item relationships in sequences. Due to the data sparsity and noise issues in sequences, a new self-supervised learning…

Machine Learning · Computer Science 2022-03-30 Zhiwei Liu, Yongjun Chen, Jia Li, Man Luo, Philip S. Yu, Caiming Xiong

Graph Contrastive Learning with Adaptive Augmentation

Recently, contrastive learning (CL) has emerged as a successful method for unsupervised graph representation learning. Most graph CL methods first perform stochastic augmentation on the input graph to obtain two graph views and maximize the…

Machine Learning · Computer Science 2021-03-01 Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, Liang Wang

You Augment Me: Exploring ChatGPT-based Data Augmentation for Semantic Code Search

Code search plays a crucial role in software development, enabling developers to retrieve and reuse code using natural language queries. While the performance of code search models improves with an increase in high-quality data, obtaining…

Software Engineering · Computer Science 2024-08-20 Yanlin Wang, Lianghong Guo, Ensheng Shi, Wenqing Chen, Jiachi Chen, Wanjun Zhong, Menghan Wang, Hui Li, Hongyu Zhang, Ziyu Lyu, Zibin Zheng

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Data augmentation has been demonstrated as an effective strategy for improving model generalization and data efficiency. However, due to the discrete nature of natural language, designing label-preserving transformations for text data tends…

Computation and Language · Computer Science 2020-10-20 Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen

Composable Augmentation Encoding for Video Representation Learning

We focus on contrastive methods for self-supervised video representation learning. A common paradigm in contrastive learning is to construct positive pairs by sampling different data views for the same instance, with different data…

Computer Vision and Pattern Recognition · Computer Science 2021-08-23 Chen Sun, Arsha Nagrani, Yonglong Tian, Cordelia Schmid

CodeRetriever: Unimodal and Bimodal Contrastive Learning for Code Search

In this paper, we propose the CodeRetriever model, which learns the function-level code semantic representations through large-scale code-text contrastive pre-training. We adopt two contrastive learning schemes in CodeRetriever: unimodal…

Computation and Language · Computer Science 2022-10-27 Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation

Decoding from the output distributions of large language models to produce high-quality text is a complex challenge in language modeling. Various approaches, such as beam search, sampling with temperature, $k$-sampling, nucleus…

Computation and Language · Computer Science 2024-10-22 Esteban Garces Arias, Julian Rodemann, Meimingwei Li, Christian Heumann, Matthias Aßenmacher

Exploring Data Augmentation for Code Generation Tasks

Advances in natural language processing, such as transfer learning from pre-trained language models, have impacted how models are trained for programming language tasks too. Previous research primarily explored code pre-training and…

Computation and Language · Computer Science 2023-02-08 Pinzhen Chen, Gerasimos Lampouras

Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning

Contrastive learning enables learning useful audio and speech representations without ground-truth labels by maximizing the similarity between latent representations of similar signal segments. In this framework various data augmentation…

Audio and Speech Processing · Electrical Eng. & Systems 2022-04-11 Salah Zaiem, Titouan Parcollet, Slim Essid

MIXCODE: Enhancing Code Classification by Mixup-Based Data Augmentation

Inspired by the great success of Deep Neural Networks (DNNs) in natural language processing (NLP), DNNs have been increasingly applied in source code analysis and attracted significant attention from the software engineering community. Due…

Software Engineering · Computer Science 2023-01-11 Zeming Dong, Qiang Hu, Yuejun Guo, Maxime Cordy, Mike Papadakis, Zhenya Zhang, Yves Le Traon, Jianjun Zhao

Contrastive Learning with Stronger Augmentations

Representation learning has significantly been developed with the advance of contrastive learning methods. Most of those methods have benefited from various data augmentations that are carefully designated to maintain their identities so…

Computer Vision and Pattern Recognition · Computer Science 2022-01-24 Xiao Wang, Guo-Jun Qi

CoViews: Adaptive Augmentation Using Cooperative Views for Enhanced Contrastive Learning

Data augmentation plays a critical role in generating high-quality positive and negative pairs necessary for effective contrastive learning. However, common practices involve using a single augmentation policy repeatedly to generate…

Computer Vision and Pattern Recognition · Computer Science 2024-05-14 Nazim Bendib

Source Code Data Augmentation for Deep Learning: A Survey

The increasingly popular adoption of deep learning models in many critical source code tasks motivates the development of data augmentation (DA) techniques to enhance training data and improve various capabilities (e.g., robustness and…

Computation and Language · Computer Science 2023-11-14 Terry Yue Zhuo, Zhou Yang, Zhensu Sun, Yufei Wang, Li Li, Xiaoning Du, Zhenchang Xing, David Lo