Related papers: Source Code Data Augmentation for Deep Learning: A…

Boosting Source Code Learning with Text-Oriented Data Augmentation: An Empirical Study

Recent studies have demonstrated remarkable advancements in source code learning, which applies deep neural networks (DNNs) to tackle various software engineering tasks. Similar to other DNN-based domains, source code learning also requires…

Software Engineering · Computer Science 2025-02-07 Zeming Dong, Qiang Hu, Yuejun Guo, Zhenya Zhang, Maxime Cordy, Mike Papadakis, Yves Le Traon, Jianjun Zhao

Data Augmentation Approaches in Natural Language Processing: A Survey

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learning techniques may fail. It is widely applied in computer vision then introduced to natural language processing and achieves improvements in…

Computation and Language · Computer Science 2022-06-28 Bohan Li, Yutai Hou, Wanxiang Che

A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability

Data augmentation (DA) is indispensable in modern machine learning and deep neural networks. The basic idea of DA is to construct new training data to improve the model's generalization by adding slightly disturbed versions of existing data…

Machine Learning · Computer Science 2024-06-05 Chengtai Cao, Fan Zhou, Yurou Dai, Jianping Wang, Kunpeng Zhang

Data Augmentation for Sequential Recommendation: A Survey

As an essential branch of recommender systems, sequential recommendation (SR) has received much attention due to its well-consistency with real-world situations. However, the widespread data sparsity issue limits the SR model's performance.…

Information Retrieval · Computer Science 2024-09-23 Yizhou Dang, Enneng Yang, Yuting Liu, Guibing Guo, Linying Jiang, Jianzhe Zhao, Xingwei Wang

A Survey on Data Augmentation in Large Model Era

Large models, encompassing large language and diffusion models, have shown exceptional promise in approximating human-level intelligence, garnering significant interest from both academic and industrial spheres. However, the training of…

Machine Learning · Computer Science 2024-03-05 Yue Zhou, Chenlu Guo, Xu Wang, Yi Chang, Yuan Wu

MIXCODE: Enhancing Code Classification by Mixup-Based Data Augmentation

Inspired by the great success of Deep Neural Networks (DNNs) in natural language processing (NLP), DNNs have been increasingly applied in source code analysis and attracted significant attention from the software engineering community. Due…

Software Engineering · Computer Science 2023-01-11 Zeming Dong, Qiang Hu, Yuejun Guo, Maxime Cordy, Mike Papadakis, Zhenya Zhang, Yves Le Traon, Jianjun Zhao

Image Data Augmentation Approaches: A Comprehensive Survey and Future directions

Deep learning (DL) algorithms have shown significant performance in various computer vision tasks. However, having limited labelled data leads to a network overfitting problem, where network performance is bad on unseen data as compared to…

Computer Vision and Pattern Recognition · Computer Science 2023-03-14 Teerath Kumar, Alessandra Mileo, Rob Brennan, Malika Bendechache

A Comprehensive Survey on Data Augmentation

Data augmentation is a series of techniques that generate high-quality artificial data by manipulating existing data samples. By leveraging data augmentation techniques, AI models can achieve significantly improved applicability in tasks…

Machine Learning · Computer Science 2025-10-16 Zaitian Wang, Pengfei Wang, Kunpeng Liu, Pengyang Wang, Yanjie Fu, Chang-Tien Lu, Charu C. Aggarwal, Jian Pei, Yuanchun Zhou

Image Data Augmentation for Deep Learning: A Survey

Deep learning has achieved remarkable results in many computer vision tasks. Deep neural networks typically rely on large amounts of training data to avoid overfitting. However, labeled data for real-world applications may be limited. By…

Computer Vision and Pattern Recognition · Computer Science 2023-11-07 Suorong Yang, Weikang Xiao, Mengchen Zhang, Suhan Guo, Jian Zhao, Furao Shen

A Survey of Data Augmentation Approaches for NLP

Data augmentation has recently seen increased interest in NLP due to more work in low-resource domains, new tasks, and the popularity of large-scale neural networks that require large amounts of training data. Despite this recent upsurge,…

Computation and Language · Computer Science 2021-12-03 Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy

A Survey on Mixup Augmentations and Beyond

As Deep Neural Networks have achieved thrilling breakthroughs in the past decade, data augmentations have garnered increasing attention as regularization techniques when massive labeled data are unavailable. Among existing augmentations,…

Machine Learning · Computer Science 2025-04-24 Xin Jin, Hongyu Zhu, Siyuan Li, Zedong Wang, Zicheng Liu, Juanxi Tian, Chang Yu, Huafeng Qin, Stan Z. Li

Advancements in Point Cloud Data Augmentation for Deep Learning: A Survey

Deep learning (DL) has become one of the mainstream and effective methods for point cloud analysis tasks such as detection, segmentation and classification. To reduce overfitting during training DL models and improve model performance…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Qinfeng Zhu, Lei Fan, Ningxin Weng

Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges

In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection. This…

Computation and Language · Computer Science 2024-07-03 Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Anh Tuan Luu, Shafiq Joty

A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning

Visual reinforcement learning (RL), which makes decisions directly from high-dimensional visual inputs, has demonstrated significant potential in various domains. However, deploying visual RL techniques in the real world remains challenging…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Guozheng Ma, Zhen Wang, Zhecheng Yuan, Xueqian Wang, Bo Yuan, Dacheng Tao

A Survey on Data Augmentation for Text Classification

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer, Marc-André Kaufhold, Christian Reuter

Research Trends and Applications of Data Augmentation Algorithms

In the Machine Learning research community, there is a consensus regarding the relationship between model complexity and the required amount of data and computation power. In real world applications, these computational requirements are not…

Machine Learning · Computer Science 2022-08-03 Joao Fonseca, Fernando Bacao

Graph Data Augmentation for Graph Machine Learning: A Survey

Data augmentation has recently seen increased interest in graph machine learning given its demonstrated ability to improve model performance and generalization by added training data. Despite this recent surge, the area is still relatively…

Machine Learning · Computer Science 2023-01-20 Tong Zhao, Wei Jin, Yozen Liu, Yingheng Wang, Gang Liu, Stephan Günnemann, Neil Shah, Meng Jiang

Exploring Data Augmentation for Code Generation Tasks

Advances in natural language processing, such as transfer learning from pre-trained language models, have impacted how models are trained for programming language tasks too. Previous research primarily explored code pre-training and…

Computation and Language · Computer Science 2023-02-08 Pinzhen Chen, Gerasimos Lampouras

Empirical Evaluation of Data Augmentations for Biobehavioral Time Series Data with Deep Learning

Deep learning has performed remarkably well on many tasks recently. However, the superior performance of deep models relies heavily on the availability of a large number of training data, which limits the wide adaptation of deep models on…

Machine Learning · Computer Science 2022-10-14 Huiyuan Yang, Han Yu, Akane Sano

Data Augmentation for Conversational AI

Advancements in conversational systems have revolutionized information access, surpassing the limitations of single queries. However, developing dialogue systems requires a large amount of training data, which is a challenge in low-resource…

Computation and Language · Computer Science 2024-03-05 Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi