Related papers: Text2Data: Low-Resource Data Generation with Textu…

A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios

Deep neural networks and huge language models are becoming omnipresent in natural language applications. As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in…

Computation and Language · Computer Science · 2021-04-12 · Michael A. Hedderich, Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

Natural language guidance of high-fidelity text-to-speech with synthetic annotations

Text-to-speech models trained on large-scale datasets have demonstrated impressive in-context learning capabilities and naturalness. However, control of speaker identity and style in these models typically requires conditioning on reference…

Sound · Computer Science · 2024-02-08 · Dan Lyth, Simon King

Weakly Supervised Scene Text Generation for Low-resource Languages

A large number of annotated training images is crucial for training successful scene text recognition models. However, collecting sufficient datasets can be a labor-intensive and costly process, particularly for low-resource languages. To…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-28 · Yangchen Xie, Xinyuan Chen, Hongjian Zhan, Palaiahankote Shivakum, Bing Yin, Cong Liu, Yue Lu

A Semi-Supervised Approach for Low-Resourced Text Generation

Recently, encoder-decoder neural models have achieved great success on text generation tasks. However, one problem of this kind of models is that their performances are usually limited by the scale of well-labeled data, which are very…

Computation and Language · Computer Science · 2019-06-04 · Hongyu Zang, Xiaojun Wan

Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation

Despite the success of text-to-text pre-trained models in various natural language generation (NLG) tasks, the generation performance is largely restricted by the number of labeled data in downstream tasks, particularly in data-to-text…

Computation and Language · Computer Science · 2022-06-07 · Pei Ke, Haozhe Ji, Zhenyu Yang, Yi Huang, Junlan Feng, Xiaoyan Zhu, Minlie Huang

Data-to-Text Generation with Style Imitation

Recent neural approaches to data-to-text generation have mostly focused on improving content fidelity while lacking explicit control over writing styles (e.g., word choices, sentence structures). More traditional systems use templates to…

Computation and Language · Computer Science · 2020-10-12 · Shuai Lin, Wentao Wang, Zichao Yang, Xiaodan Liang, Frank F. Xu, Eric Xing, Zhiting Hu

Not Enough Data? Deep Learning to the Rescue!

Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially…

Computation and Language · Computer Science · 2019-11-28 · Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

Minimally Supervised Categorization of Text with Metadata

Document categorization, which aims to assign a topic label to each document, plays a fundamental role in a wide variety of applications. Despite the success of existing studies in conventional supervised document classification, they are…

Computation and Language · Computer Science · 2023-10-24 · Yu Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han

Improving Logical-Level Natural Language Generation with Topic-Conditioned Data Augmentation and Logical Form Generation

Logical Natural Language Generation, i.e., generating textual descriptions that can be logically entailed by a structured table, has been a challenge due to the low fidelity of the generation. Chen et al. (2020) have addressed this…

Computation and Language · Computer Science · 2021-12-14 · Ao Liu, Congjian Luo, Naoaki Okazaki

Universal Cross-Lingual Text Classification

Text classification, an integral task in natural language processing, involves the automatic categorization of text into predefined classes. Creating supervised labeled datasets for low-resource languages poses a considerable challenge.…

Computation and Language · Computer Science · 2024-06-18 · Riya Savant, Anushka Shelke, Sakshi Todmal, Sanskruti Kanphade, Ananya Joshi, Raviraj Joshi

Data-to-text Generation with Variational Sequential Planning

We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input. We focus on generating long-form text, i.e., documents with multiple paragraphs, and propose a neural model enhanced with a…

Computation and Language · Computer Science · 2022-03-01 · Ratish Puduppully, Yao Fu, Mirella Lapata

Data-to-Text Generation with Iterative Text Editing

We present a novel approach to data-to-text generation based on iterative text editing. Our approach maximizes the completeness and semantic accuracy of the output text while leveraging the abilities of recent pre-trained models for text…

Computation and Language · Computer Science · 2021-01-29 · Zdeněk Kasner, Ondřej Dušek

Unsupervised Natural Language Generation with Denoising Autoencoders

Generating text from structured data is important for various tasks such as question answering and dialog systems. We show that in at least one domain, without any supervision and only based on unlabeled text, we are able to build a Natural…

Computation and Language · Computer Science · 2018-08-28 · Markus Freitag, Scott Roy

Text2Weight: Bridging Natural Language and Neural Network Weight Spaces

How far are we really from automatically generating neural networks? While neural network weight generation shows promise, current approaches struggle with generalization to unseen tasks and practical application exploration. To address…

Machine Learning · Computer Science · 2025-08-20 · Bowen Tian, Wenshuo Chen, Zexi Li, Songning Lai, Jiemin Wu, Yutao Yue

Unsupervised Data Validation Methods for Efficient Model Training

This paper investigates the challenges and potential solutions for improving machine learning systems for low-resource languages. State-of-the-art models in natural language processing (NLP), text-to-speech (TTS), speech-to-text (STT), and…

Computation and Language · Computer Science · 2024-10-11 · Yurii Paniv

Faithful Low-Resource Data-to-Text Generation through Cycle Training

Methods to generate text from structured data have advanced significantly in recent years, primarily due to fine-tuning of pre-trained language models on large datasets. However, such models can fail to produce output faithful to the input…

Computation and Language · Computer Science · 2023-07-12 · Zhuoer Wang, Marcus Collins, Nikhita Vedula, Simone Filice, Shervin Malmasi, Oleg Rokhlenko

Simple and Effective Unsupervised Speech Translation

The amount of labeled data to train models for speech tasks is limited for most languages, however, the data scarcity is exacerbated for speech translation which requires labeled data covering two different languages. To address this issue,…

Computation and Language · Computer Science · 2022-10-20 · Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino

Revisiting Interpolation Augmentation for Speech-to-Text Generation

Speech-to-text (S2T) generation systems frequently face challenges in low-resource scenarios, primarily due to the lack of extensive labeled datasets. One emerging solution is constructing virtual training samples by interpolating inputs…

Computation and Language · Computer Science · 2024-06-25 · Chen Xu, Jie Wang, Xiaoqian Liu, Qianqian Dong, Chunliang Zhang, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

Efficient and Training-Free Control of Language Generation

In recent years, there has been a growing interest in the development of language models capable of generating text with controllable attributes. While several approaches have been proposed, many of these methods require condition-specific…

Computation and Language · Computer Science · 2023-02-22 · Shangda Wu, Maosong Sun

Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain

Existing data-to-text generation efforts mainly focus on generating a coherent text from non-linguistic input data, such as tables and attribute-value pairs, but overlook that different application scenarios may require texts of different…

Computation and Language · Computer Science · 2023-05-08 · Liqiang Jing, Xuemeng Song, Xuming Lin, Zhongzhou Zhao, Wei Zhou, Liqiang Nie