Related papers: ARMADA: Attribute-Based Multimodal Data Augmentati…

Learning Multimodal Data Augmentation in Feature Space

The ability to jointly learn from multiple modalities, such as text, audio, and visual data, is a defining feature of intelligent systems. While there have been promising advances in designing neural networks to harness multimodal data, the…

Machine Learning · Computer Science 2023-04-25 Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson

DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling

In this paper, we present an effective data augmentation framework leveraging the Large Language Model (LLM) and Diffusion Model (DM) to tackle the challenges inherent in data-scarce scenarios. Recently, DMs have opened up the possibility…

Computer Vision and Pattern Recognition · Computer Science 2024-09-26 Kyuheon Jung, Yongdeuk Seo, Seongwoo Cho, Jaeyoung Kim, Hyun-seok Min, Sungchul Choi

From Images to Words: Efficient Cross-Modal Knowledge Distillation to Language Models from Black-box Teachers

Knowledge distillation (KD) methods are pivotal in compressing large pre-trained language models into smaller models, ensuring computational efficiency without significantly dropping performance. Traditional KD techniques assume homogeneity…

Computation and Language · Computer Science 2026-03-12 Ayan Sengupta, Shantanu Dixit, Md Shad Akhtar, Tanmoy Chakraborty

ARDA: Automatic Relational Data Augmentation for Machine Learning

Automatic machine learning (AutoML) is a family of techniques to automate the process of training predictive models, aiming to both improve performance and make machine learning more accessible. While many recent works have focused on aspects…

Machine Learning · Computer Science 2020-03-24 Nadiia Chepurko, Ryan Marcus, Emanuel Zgraggen, Raul Castro Fernandez, Tim Kraska, David Karger

Not Enough Data? Deep Learning to the Rescue!

Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially…

Computation and Language · Computer Science 2019-11-28 Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models

Recent works have shown that powerful pre-trained language models (PLM) can be fooled by small perturbations or intentional attacks. To solve this issue, various data augmentation techniques are proposed to improve the robustness of PLMs.…

Computation and Language · Computer Science 2021-09-14 Kun Zhou, Wayne Xin Zhao, Sirui Wang, Fuzheng Zhang, Wei Wu, Ji-Rong Wen

Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach

Web-scale visual entity recognition, the task of associating images with their corresponding entities within vast knowledge bases like Wikipedia, presents significant challenges due to the lack of clean, large-scale training data. In this…

Computer Vision and Pattern Recognition · Computer Science 2024-11-01 Mathilde Caron, Alireza Fathi, Cordelia Schmid, Ahmet Iscen

Contrastive Visual Data Augmentation

Large multimodal models (LMMs) often struggle to recognize novel concepts, as they rely on pre-trained knowledge and have limited ability to capture subtle visual details. Domain-specific knowledge gaps in training also make them prone to…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Yu Zhou, Bingxuan Li, Mohan Tang, Xiaomeng Jin, Te-Lin Wu, Kuan-Hao Huang, Heng Ji, Kai-Wei Chang, Nanyun Peng

Abstract Meaning Representation-Based Logic-Driven Data Augmentation for Logical Reasoning

Combining large language models with logical reasoning enhances their capacity to address problems in a robust and reliable manner. Nevertheless, the intricate nature of logical reasoning poses challenges when gathering reliable data from…

Computation and Language · Computer Science 2025-04-18 Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Gael Gendron, Timothy Pistotti, Neset Tan, Nathan Young, Yang Chen, Yonghua Zhu, Paul Denny, Michael Witbrock, Jiamou Liu

Multimodal Large Language Models for Image, Text, and Speech Data Augmentation: A Survey

In the past five years, research has shifted from traditional Machine Learning (ML) and Deep Learning (DL) approaches to leveraging Large Language Models (LLMs), including multimodality, for data augmentation to enhance generalization, and…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Ranjan Sapkota, Shaina Raza, Maged Shoman, Achyut Paudel, Manoj Karkee

Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks

Despite large successes of recent language models on diverse tasks, they suffer from severe performance degeneration in low-resource settings with limited training data available. Many existing works tackle this problem by generating…

Computation and Language · Computer Science 2024-02-22 Minju Seo, Jinheon Baek, James Thorne, Sung Ju Hwang

AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering

Medical Multimodal Large Language Models (Med-MLLMs) have shown great promise in medical visual question answering (Med-VQA). However, when deployed in low-resource settings where abundant labeled data are unavailable, existing Med-MLLMs…

Computation and Language · Computer Science 2025-10-06 Ziqing Wang, Chengsheng Mao, Xiaole Wen, Yuan Luo, Kaize Ding

BSDA: Bayesian Random Semantic Data Augmentation for Medical Image Classification

Data augmentation is a crucial regularization technique for deep neural networks, particularly in medical image classification. Mainstream data augmentation (DA) methods are usually applied at the image level. Due to the specificity and…

Computer Vision and Pattern Recognition · Computer Science 2024-06-28 Yaoyao Zhu, Xiuding Cai, Xueyao Wang, Xiaoqing Chen, Yu Yao, Zhongliang Fu

Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness

Artificial neural networks typically struggle in generalizing to out-of-context examples. One reason for this limitation is caused by having datasets that incorporate only partial information regarding the potential correlational structure…

Computer Vision and Pattern Recognition · Computer Science 2023-11-20 Valentin Barriere, Felipe del Rio, Andres Carvallo De Ferari, Carlos Aspillaga, Eugenio Herrera-Berg, Cristian Buc Calderon

Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution

Human preference alignment can greatly enhance Multimodal Large Language Models (MLLMs), but collecting high-quality preference data is costly. A promising solution is the self-evolution strategy, where models are iteratively trained on…

Machine Learning · Computer Science 2024-12-23 Wentao Tan, Qiong Cao, Yibing Zhan, Chao Xue, Changxing Ding

RoPDA: Robust Prompt-based Data Augmentation for Low-Resource Named Entity Recognition

Data augmentation has been widely used in low-resource NER tasks to tackle the problem of data sparsity. However, previous data augmentation methods have the disadvantages of disrupted syntactic structures, token-label mismatch, and…

Computation and Language · Computer Science 2023-07-18 Sihan Song, Furao Shen, Jian Zhao

MAGE: Multi-Head Attention Guided Embeddings for Low Resource Sentiment Classification

Due to the lack of quality data for low-resource Bantu languages, significant challenges are presented in text classification and other practical implementations. In this paper, we introduce an advanced model combining Language-Independent…

Computation and Language · Computer Science 2025-02-26 Varun Vashisht, Samar Singh, Mihir Konduskar, Jaskaran Singh Walia, Vukosi Marivate

Balanced Training Data Augmentation for Aspect-Based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) is a crucial fine-grained task in social media scenarios to identify the sentiment polarity of specific aspect terms in a sentence. Although many existing studies leverage large language models (LLMs)…

Computation and Language · Computer Science 2025-07-15 Junjie Liu, Yuanhe Tian, Yan Song

Retrieving Multimodal Information for Augmented Generation: A Survey

As Large Language Models (LLMs) become popular, there emerged an important trend of using multimodality to augment the LLMs' generation ability, which enables LLMs to better interact with the world. However, there lacks a unified perception…

Computation and Language · Computer Science 2023-12-04 Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty

Multimodal RAG Enhanced Visual Description

Textual descriptions for multimodal inputs entail recurrent refinement of queries to produce relevant output images. Despite efforts to address challenges such as scaling model size and data volume, the cost associated with pre-training and…

Machine Learning · Computer Science 2025-08-14 Amit Kumar Jaiswal, Haiming Liu, Ingo Frommholz