Related papers: Data Augmentation for Mathematical Objects

Towards Understanding How Data Augmentation Works with Imbalanced Data

Data augmentation forms the cornerstone of many modern machine learning training pipelines; yet, the mechanisms by which it works are not clearly understood. Much of the research on data augmentation (DA) has focused on improving existing…

Machine Learning · Computer Science 2023-04-13 Damien A. Dablain, Nitesh V. Chawla

Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD

Symbolic Computation algorithms and their implementation in computer algebra systems often contain choices which do not affect the correctness of the output but can significantly impact the resources required: such choices can benefit from…

Symbolic Computation · Computer Science 2024-09-12 Tereso del Río, Matthew England

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Data augmentation has been widely applied as an effective methodology to improve generalization in particular when training deep neural networks. Recently, researchers proposed a few intensive data augmentation techniques, which indeed…

Machine Learning · Computer Science 2019-11-22 Zhuoxun He, Lingxi Xie, Xin Chen, Ya Zhang, Yanfeng Wang, Qi Tian

Is augmentation effective to improve prediction in imbalanced text datasets?

Imbalanced datasets present a significant challenge for machine learning models, often leading to biased predictions. To address this issue, data augmentation techniques are widely used in natural language processing (NLP) to generate new…

Computation and Language · Computer Science 2023-04-21 Gabriel O. Assunção, Rafael Izbicki, Marcos O. Prates

A Preliminary Study on Data Augmentation of Deep Learning for Image Classification

Deep learning models have a large number of free parameters that need to be calculated by effective training of the models on a great deal of training data to improve their generalization performance. However, data obtaining and labeling is…

Computer Vision and Pattern Recognition · Computer Science 2019-07-01 Benlin Hu, Cheng Lei, Dong Wang, Shu Zhang, Zhenyu Chen

Are Data Augmentation Methods in Named Entity Recognition Applicable for Uncertainty Estimation?

This work investigates the impact of data augmentation on confidence calibration and uncertainty estimation in Named Entity Recognition (NER) tasks. For the future advance of NER in safety-critical fields like healthcare and finance, it is…

Computation and Language · Computer Science 2024-10-28 Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe

Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving

Math Word Problem (MWP) solving presents a challenging task in Natural Language Processing (NLP). This study aims to provide MWP solvers with a more diverse training set, ultimately improving their ability to solve various math problems. We…

Computation and Language · Computer Science 2024-05-02 Gulsum Yigit, Mehmet Fatih Amasyali

Semantic-based Data Augmentation for Math Word Problems

It's hard for neural MWP solvers to deal with tiny local variances. In MWP task, some local changes conserve the original semantic while the others may totally change the underlying logic. Currently, existing datasets for MWP task contain…

Computation and Language · Computer Science 2022-04-19 Ailisi Li, Jiaqing Liang, Yanghua Xiao

To Augment or Not to Augment? Diagnosing Distributional Symmetry Breaking

Symmetry-aware methods for machine learning, such as data augmentation and equivariant architectures, encourage correct model behavior on all transformations (e.g. rotations or permutations) of the original dataset. These methods can…

Machine Learning · Computer Science 2026-03-31 Hannah Lawrence, Elyssa Hofgard, Vasco Portilheiro, Yuxuan Chen, Tess Smidt, Robin Walters

Data Augmentation by Pairing Samples for Images Classification

Data augmentation is a widely used technique in many machine learning tasks, such as image classification, to virtually enlarge the training dataset size and avoid overfitting. Traditional data augmentation techniques for image…

Machine Learning · Computer Science 2018-04-12 Hiroshi Inoue

Improved Mixed-Example Data Augmentation

In order to reduce overfitting, neural networks are typically trained with data augmentation, the practice of artificially generating additional training data via label-preserving transformations of existing training examples. While these…

Computer Vision and Pattern Recognition · Computer Science 2019-01-23 Cecilia Summers, Michael J. Dinneen

A Survey on Data Augmentation for Text Classification

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer, Marc-André Kaufhold, Christian Reuter

Incorporating Multiple Cluster Centers for Multi-Label Learning

Multi-label learning deals with the problem that each instance is associated with multiple labels simultaneously. Most of the existing approaches aim to improve the performance of multi-label learning by exploiting label correlations.…

Machine Learning · Computer Science 2022-01-19 Senlin Shu, Fengmao Lv, Yan Yan, Li Li, Shuo He, Jun He

Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations

We propose a novel data augmentation for labeled sentences called contextual augmentation. We assume an invariance that sentences are natural even if the words in the sentences are replaced with other words with paradigmatic relations. We…

Computation and Language · Computer Science 2018-05-17 Sosuke Kobayashi

Image Data Augmentation for Deep Learning: A Survey

Deep learning has achieved remarkable results in many computer vision tasks. Deep neural networks typically rely on large amounts of training data to avoid overfitting. However, labeled data for real-world applications may be limited. By…

Computer Vision and Pattern Recognition · Computer Science 2023-11-07 Suorong Yang, Weikang Xiao, Mengchen Zhang, Suhan Guo, Jian Zhao, Furao Shen

Automatic Data Augmentation via Invariance-Constrained Learning

Underlying data structures, such as symmetries or invariances to transformations, are often exploited to improve the solution of learning tasks. However, embedding these properties in models or learning algorithms can be challenging and…

Machine Learning · Computer Science 2023-09-19 Ignacio Hounie, Luiz F. O. Chamon, Alejandro Ribeiro

ResizeMix: Mixing Data with Preserved Object Information and True Labels

Data augmentation is a powerful technique to increase the diversity of data, which can effectively improve the generalization ability of neural networks in image recognition tasks. Recent data mixing based augmentation strategies have…

Computer Vision and Pattern Recognition · Computer Science 2020-12-22 Jie Qin, Jiemin Fang, Qian Zhang, Wenyu Liu, Xingang Wang, Xinggang Wang

Diversity-oriented Data Augmentation with Large Language Models

Data augmentation is an essential technique in natural language processing (NLP) for enriching training datasets by generating diverse samples. This process is crucial for improving the robustness and generalization capabilities of NLP…

Computation and Language · Computer Science 2025-10-16 Zaitian Wang, Jinghan Zhang, Xinhao Zhang, Kunpeng Liu, Pengfei Wang, Yuanchun Zhou

The Effectiveness of Data Augmentation in Image Classification using Deep Learning

In this paper, we explore and compare multiple solutions to the problem of data augmentation in image classification. Previous work has demonstrated the effectiveness of data augmentation through simple techniques, such as cropping,…

Computer Vision and Pattern Recognition · Computer Science 2017-12-14 Luis Perez, Jason Wang

Data Augmentation for Manipulation

The success of deep learning depends heavily on the availability of large datasets, but in robotic manipulation there are many learning problems for which such datasets do not exist. Collecting these datasets is time-consuming and…

Robotics · Computer Science 2022-07-21 Peter Mitrano, Dmitry Berenson