English
Related papers

Related papers: Leveraging Data Augmentation for Process Informati…

200 papers

We present data augmentation techniques for process extraction tasks in scientific publications. We cast the process extraction task as a sequence labeling task where we identify all the entities in a sentence and label them according to…

Computation and Language · Computer Science 2025-04-16 Yuni Susanti

Data augmentation is an effective way to improve the performance of many neural text generation models. However, current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the…

Computation and Language · Computer Science 2021-05-31 Wei Bi , Huayang Li , Jiacheng Huang

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve…

Computation and Language · Computer Science 2022-07-25 Markus Bayer , Marc-André Kaufhold , Björn Buchhold , Marcel Keller , Jörg Dallmeyer , Christian Reuter

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learning techniques may fail. It is widely applied in computer vision then introduced to natural language processing and achieves improvements in…

Computation and Language · Computer Science 2022-06-28 Bohan Li , Yutai Hou , Wanxiang Che

Advances in natural language processing, such as transfer learning from pre-trained language models, have impacted how models are trained for programming language tasks too. Previous research primarily explored code pre-training and…

Computation and Language · Computer Science 2023-02-08 Pinzhen Chen , Gerasimos Lampouras

Machine-learning based generation of process models from natural language text process descriptions provides a solution for the time-intensive and expensive process discovery phase. Many organizations have to carry out this phase, before…

Computation and Language · Computer Science 2024-10-03 Julian Neuberger , Han van der Aa , Lars Ackermann , Daniel Buschek , Jannic Herrmann , Stefan Jablonski

The increasing size and complexity of pre-trained language models have demonstrated superior performance in many applications, but they usually require large training datasets to be adequately trained. Insufficient training sets could…

Computation and Language · Computer Science 2025-02-03 Yaping Chai , Haoran Xie , Joe S. Qin

Large models, encompassing large language and diffusion models, have shown exceptional promise in approximating human-level intelligence, garnering significant interest from both academic and industrial spheres. However, the training of…

Machine Learning · Computer Science 2024-03-05 Yue Zhou , Chenlu Guo , Xu Wang , Yi Chang , Yuan Wu

In the Machine Learning research community, there is a consensus regarding the relationship between model complexity and the required amount of data and computation power. In real world applications, these computational requirements are not…

Machine Learning · Computer Science 2022-08-03 Joao Fonseca , Fernando Bacao

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer , Marc-André Kaufhold , Christian Reuter

With the capabilities of understanding and executing natural language instructions, Large language models (LLMs) can potentially act as a powerful tool for textual data augmentation. However, the quality of augmented data depends heavily on…

Computation and Language · Computer Science 2024-04-30 Yichuan Li , Kaize Ding , Jianling Wang , Kyumin Lee

In this paper, we propose a pipeline leveraging Large Language Models (LLMs) for data augmentation in Information Extraction tasks within the legal domain. The proposed method is both simple and effective, significantly reducing the manual…

Computation and Language · Computer Science 2026-01-12 Nguyen Minh Phuong , Ha-Thanh Nguyen , May Myo Zin , Ken Satoh

Simple yet effective data augmentation techniques have been proposed for sentence-level and sentence-pair natural language processing tasks. Inspired by these efforts, we design and compare data augmentation for named entity recognition,…

Computation and Language · Computer Science 2020-10-23 Xiang Dai , Heike Adel

While the abundance of rich and vast datasets across numerous fields has facilitated the advancement of natural language processing, sectors in need of specialized data types continue to struggle with the challenge of finding quality data.…

Computation and Language · Computer Science 2026-02-06 Hyeonseok Kang , Hyein Seo , Jeesu Jung , Sangkeun Jung , Du-Seong Chang , Riwoo Chung

Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision. However, less work has been done in the context of text, partially due to its discrete nature and the complexity of…

Computation and Language · Computer Science 2021-01-12 Ping Yu , Ruiyi Zhang , Yang Zhao , Yizhe Zhang , Chunyuan Li , Changyou Chen

Over the past decade, extensive research efforts have been dedicated to the extraction of information from textual process descriptions. Despite the remarkable progress witnessed in natural language processing (NLP), information extraction…

Computation and Language · Computer Science 2024-07-29 Julian Neuberger , Lars Ackermann , Han van der Aa , Stefan Jablonski

Text data augmentation is an effective strategy for overcoming the challenge of limited sample sizes in many natural language processing (NLP) tasks. This challenge is especially prominent in the few-shot learning scenario, where the data…

While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for…

Computation and Language · Computer Science 2019-05-28 Jinhua Zhu , Fei Gao , Lijun Wu , Yingce Xia , Tao Qin , Wengang Zhou , Xueqi Cheng , Tie-Yan Liu

Data scarcity is a problem that occurs in languages and tasks where we do not have large amounts of labeled data but want to use state-of-the-art models. Such models are often deep learning models that require a significant amount of data…

Computation and Language · Computer Science 2023-02-23 Domagoj Pluščec , Jan Šnajder

Textual data augmentation (DA) is a prolific field of study where novel techniques to create artificial data are regularly proposed, and that has demonstrated great efficiency on small data settings, at least for text classification tasks.…

Computation and Language · Computer Science 2024-09-18 Frédéric Piedboeuf , Philippe Langlais
‹ Prev 1 2 3 10 Next ›