Related papers: Compositional Generalization for Multi-label Text …

Compositional Generalization for Data-to-Text Generation

Data-to-text generation involves transforming structured data, often represented as predicate-argument tuples, into coherent textual descriptions. Despite recent advances, systems still struggle when confronted with unseen combinations of…

Computation and Language · Computer Science 2023-12-06 Xinnuo Xu, Ivan Titov, Mirella Lapata

Improving Compositional Generalization in Math Word Problem Solving

Compositional generalization refers to a model's capability to generalize to newly composed input data based on the data components observed during training. It has triggered a series of compositional generalization analyses on different…

Computation and Language · Computer Science 2022-09-07 Yunshi Lan, Lei Wang, Jing Jiang, Ee-Peng Lim

Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation

Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text…

Computation and Language · Computer Science 2024-06-04 Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao

GILE: A Generalized Input-Label Embedding for Text Classification

Neural text classification models typically treat output labels as categorical variables which lack description and semantics. This forces their parametrization to be dependent on the label set size, and, hence, they are unable to scale to…

Computation and Language · Computer Science 2019-01-31 Nikolaos Pappas, James Henderson

Improving Compositional Generalization in Classification Tasks via Structure Annotations

Compositional generalization is the ability to generalize systematically to a new data distribution by combining known components. Although humans seem to have a great ability to generalize compositionally, state-of-the-art neural models…

Machine Learning · Computer Science 2021-06-22 Juyong Kim, Pradeep Ravikumar, Joshua Ainslie, Santiago Ontañón

Consistent Text Categorization using Data Augmentation in e-Commerce

The categorization of massive e-Commerce data is a crucial, well-studied task, which is prevalent in industrial settings. In this work, we aim to improve an existing product categorization model that is already in use by a major web…

Machine Learning · Computer Science 2023-05-31 Guy Horowitz, Stav Yanovsky Daye, Noa Avigdor-Elgrabli, Ariel Raviv

In real-world applications, as data availability increases, obtaining labeled data for machine learning (ML) projects remains challenging due to the high costs and intensive efforts required for data annotation. Many ML projects,…

Machine Learning · Computer Science 2024-12-24 Ismail Hakki Karaman, Gulser Koksal, Levent Eriskin, Salih Salihoglu

Improving Compositional Generalization with Latent Structure and Data Augmentation

Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to…

Computation and Language · Computer Science 2022-05-06 Linlu Qiu, Peter Shaw, Panupong Pasupat, Paweł Krzysztof Nowak, Tal Linzen, Fei Sha, Kristina Toutanova

Measuring Compositional Generalization: A Comprehensive Method on Realistic Data

State-of-the-art machine learning methods exhibit limited compositional generalization. At the same time, there is a lack of realistic benchmarks that comprehensively measure this ability, which makes it challenging to find and evaluate…

Machine Learning · Computer Science 2020-06-26 Daniel Keysers, Nathanael Schärli, Nathan Scales, Hylke Buisman, Daniel Furrer, Sergii Kashubin, Nikola Momchev, Danila Sinopalnikov, Lukasz Stafiniak, Tibor Tihon, Dmitry Tsarkov, Xiao Wang, Marc van Zee, Olivier Bousquet

SGM: Sequence Generation Model for Multi-label Classification

Multi-label classification is an important yet challenging task in natural language processing. It is more complex than single-label classification in that the labels tend to be correlated. Existing methods tend to ignore the correlations…

Computation and Language · Computer Science 2018-06-18 Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu, Houfeng Wang

Not Enough Data? Deep Learning to the Rescue!

Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially…

Computation and Language · Computer Science 2019-11-28 Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

A Survey on Data Augmentation for Text Classification

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer, Marc-André Kaufhold, Christian Reuter

Retrieval-augmented Multi-label Text Classification

Multi-label text classification (MLC) is a challenging task in settings of large label sets, where label support follows a Zipfian distribution. In this paper, we address this problem through retrieval augmentation, aiming to improve the…

Computation and Language · Computer Science 2023-05-23 Ilias Chalkidis, Yova Kementchedjhieva

Toward Robustness in Multi-label Classification: A Data Augmentation Strategy against Imbalance and Noise

Multi-label classification poses challenges due to imbalanced and noisy labels in training data. We propose a unified data augmentation method, named BalanceMix, to address these challenges. Our approach includes two samplers for imbalanced…

Machine Learning · Computer Science 2023-12-13 Hwanjun Song, Minseok Kim, Jae-Gil Lee

Compositional Generalization by Learning Analytical Expressions

Compositional generalization is a basic and essential intellective capability of human beings, which allows us to recombine known parts readily. However, existing neural network based models have been proven to be extremely deficient in…

Artificial Intelligence · Computer Science 2020-10-27 Qian Liu, Shengnan An, Jian-Guang Lou, Bei Chen, Zeqi Lin, Yan Gao, Bin Zhou, Nanning Zheng, Dongmei Zhang

Batch Aggregation: An Approach to Enhance Text Classification with Correlated Augmented Data

Natural language processing models often face challenges due to limited labeled data, especially in domain-specific areas, e.g., clinical trials. To overcome this, text augmentation techniques are commonly used to increase sample size by…

Computation and Language · Computer Science 2025-04-08 Charco Hui, Yalu Wen

Incorporating Multiple Cluster Centers for Multi-Label Learning

Multi-label learning deals with the problem that each instance is associated with multiple labels simultaneously. Most of the existing approaches aim to improve the performance of multi-label learning by exploiting label correlations.…

Machine Learning · Computer Science 2022-01-19 Senlin Shu, Fengmao Lv, Yan Yan, Li Li, Shuo He, Jun He

SpeechMLC: Speech Multi-label Classification

In this paper, we propose a multi-label classification framework to detect multiple speaking styles in a speech sample. Unlike previous studies that have primarily focused on identifying a single target style, our framework effectively…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-19 Miseul Kim, Seyun Um, Hyeonjin Cha, Hong-goo Kang

Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification

The explosion of textual data has made manual document classification increasingly challenging. To address this, we introduce a robust, efficient domain-agnostic generative model framework for multi-label text classification. Instead of…

Computation and Language · Computer Science 2025-07-22 Subhendu Khatuya, Shashwat Naidu, Saptarshi Ghosh, Pawan Goyal, Niloy Ganguly

Does Data Scaling Lead to Visual Compositional Generalization?

Compositional understanding is crucial for human intelligence, yet it remains unclear whether contemporary vision models exhibit it. The dominant machine learning paradigm is built on the premise that scaling data and model sizes will…

Machine Learning · Computer Science 2025-07-10 Arnas Uselis, Andrea Dittadi, Seong Joon Oh