English
Related papers

Related papers: Improving short text classification through global…

200 papers

Data augmentation techniques are widely used in text classification tasks to improve the performance of classifiers, especially in low-resource scenarios. Most previous methods conduct text augmentation without considering the different…

Computation and Language · Computer Science 2022-09-07 Biyang Guo , Songqiao Han , Hailiang Huang

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve…

Computation and Language · Computer Science 2022-07-25 Markus Bayer , Marc-André Kaufhold , Björn Buchhold , Marcel Keller , Jörg Dallmeyer , Christian Reuter

Mixup, a recent proposed data augmentation method through linearly interpolating inputs and modeling targets of random samples, has demonstrated its capability of significantly improving the predictive accuracy of the state-of-the-art…

Computation and Language · Computer Science 2019-05-23 Hongyu Guo , Yongyi Mao , Richong Zhang

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer , Marc-André Kaufhold , Christian Reuter

Text augmentation is an effective technique for addressing the problem of insufficient data in natural language processing. However, existing text augmentation methods tend to focus on few-shot scenarios and usually perform poorly on large…

Computation and Language · Computer Science 2024-04-02 Heng Yang , Ke Li

Short-text classification, like all data science, struggles to achieve high performance using limited data. As a solution, a short sentence may be expanded with new and relevant feature words to form an artificially enlarged dataset, and…

Computation and Language · Computer Science 2019-09-18 Duncan Cameron-Steinke

While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for…

Computation and Language · Computer Science 2019-05-28 Jinhua Zhu , Fei Gao , Lijun Wu , Yingce Xia , Tao Qin , Wengang Zhou , Xueqi Cheng , Tie-Yan Liu

Dementia is a progressive neurological disorder that profoundly affects the daily lives of older adults, impairing abilities such as verbal communication and cognitive function. Early diagnosis is essential for enhancing both lifespan and…

Computational Engineering, Finance, and Science · Computer Science 2023-11-07 Kaiying Lin , Peter Washington

Text augmentation techniques are widely used in text classification problems to improve the performance of classifiers, especially in low-resource scenarios. Whilst lots of creative text augmentation methods have been designed, they augment…

Computation and Language · Computer Science 2021-09-02 Biyang Guo , Sonqiao Han , Hailiang Huang

Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to improve predictive performance. Synthetic data generation is common in numerous domains. However, recently text augmentation has emerged in…

Computation and Language · Computer Science 2023-09-12 Mosleh Mahamud , Zed Lee , Isak Samsten

Distributed representations of words have shown to be useful to improve the effectiveness of IR systems in many sub-tasks like query expansion, retrieval and ranking. Algorithms like word2vec, GloVe and others are also key factors in many…

Information Retrieval · Computer Science 2019-09-05 Tommaso Teofili , Niyati Chhaya

In this work we investigate the impact of applying textual data augmentation tasks to low resource machine translation. There has been recent interest in investigating approaches for training systems for languages with limited resources and…

Computation and Language · Computer Science 2023-06-14 Catherine Gitau , VUkosi Marivate

Multimodal Person Reidentification is gaining popularity in the research community due to its effectiveness compared to counter-part unimodal frameworks. However, the bottleneck for multimodal deep learning is the need for a large volume of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Mulham Fawakherji , Eduard Vazquez , Pasquale Giampa , Binod Bhattarai

In this paper, we investigate data augmentation for text generation, which we call GenAug. Text generation and language modeling are important tasks within natural language processing, and are especially challenging for low-data regimes. We…

Computation and Language · Computer Science 2020-10-13 Steven Y. Feng , Varun Gangal , Dongyeop Kang , Teruko Mitamura , Eduard Hovy

Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts. Recent studies report that prompt-based direct classification eliminates the need for fine-tuning but lacks…

Computation and Language · Computer Science 2021-11-19 Kang Min Yoo , Dongju Park , Jaewook Kang , Sang-Woo Lee , Woomyeong Park

Data augmentation has proven widely effective in computer vision. In Natural Language Processing (NLP) data augmentation remains an area of active research. There is no widely accepted augmentation technique that works well across tasks and…

Computation and Language · Computer Science 2023-03-07 Isabel Garcia Pietri , Kineret Stanley

Research on data generation and augmentation has been focused majorly on enhancing generation models, leaving a notable gap in the exploration and refinement of methods for evaluating synthetic data. There are several text similarity…

Computation and Language · Computer Science 2023-11-09 Tiasa Singha Roy , Priyam Basu

Effectively making sense of short texts is a critical task for many real world applications such as search engines, social media services, and recommender systems. The task is particularly challenging as a short text contains very sparse…

Computation and Language · Computer Science 2017-09-04 Jian Tang , Yue Wang , Kai Zheng , Qiaozhu Mei

Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an…

Computation and Language · Computer Science 2018-07-11 Vincent Major , Alisa Surkis , Yindalon Aphinyanaphongs

This paper proposes a simple yet effective interpolation-based data augmentation approach termed DoubleMix, to improve the robustness of models in text classification. DoubleMix first leverages a couple of simple augmentation operations to…

Computation and Language · Computer Science 2022-09-13 Hui Chen , Wei Han , Diyi Yang , Soujanya Poria
‹ Prev 1 2 3 10 Next ›