Related papers: MAPGN: MAsked Pointer-Generator Network for sequen…

200 papers

Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous word, and are state-of-the-art for several…

Machine Learning · Statistics 2018-03-02 William Fedus, Ian Goodfellow, Andrew M. Dai

Word alignment, which aims to align translationally equivalent words between source and target sentences, plays an important role in many natural language processing tasks. Current unsupervised neural alignment methods focus on inducing…

Computation and Language · Computer Science 2021-05-18 Chi Chen, Maosong Sun, Yang Liu

Prompt learning has achieved great success in efficiently exploiting large-scale pre-trained models in natural language processing (NLP). It reformulates the downstream tasks as the generative pre-training ones to achieve consistency, thus…

Computer Vision and Pattern Recognition · Computer Science 2023-12-18 Ning Liao, Bowen Shi, Xiaopeng Zhang, Min Cao, Junchi Yan, Qi Tian

In this paper, we generalize text infilling (e.g., masked language models) by proposing Sequence Span Rewriting (SSR) as a self-supervised sequence-to-sequence (seq2seq) pre-training objective. SSR provides more fine-grained learning…

Computation and Language · Computer Science 2021-09-27 Wangchunshu Zhou, Tao Ge, Canwen Xu, Ke Xu, Furu Wei

This paper presents methods of making use of text supervision to improve the performance of sequence-to-sequence (seq2seq) voice conversion. Compared with conventional frame-to-frame voice conversion approaches, the seq2seq acoustic…

Sound · Computer Science 2020-01-14 Jing-Xuan Zhang, Zhen-Hua Ling, Yuan Jiang, Li-Juan Liu, Chen Liang, Li-Rong Dai

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two…

Computation and Language · Computer Science 2017-04-26 Abigail See, Peter J. Liu, Christopher D. Manning

This work presents a general unsupervised learning method to improve the accuracy of sequence to sequence (seq2seq) models. In our method, the weights of the encoder and decoder of a seq2seq model are initialized with the pretrained weights…

Computation and Language · Computer Science 2018-02-23 Prajit Ramachandran, Peter J. Liu, Quoc V. Le

Recent neural sequence-to-sequence models with a copy mechanism have achieved remarkable progress in various text generation tasks. These models addressed out-of-vocabulary problems and facilitated the generation of rare words. However, the…

Computation and Language · Computer Science 2021-12-21 Sanghyuk Choi, Jeong-in Hwang, Hyungjong Noh, Yeonsoo Lee

Speech representations learned from Self-supervised learning (SSL) models can benefit various speech processing tasks. However, utilizing SSL representations usually requires fine-tuning the pre-trained models or designing task-specific…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-12 Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li, Hung-yi Lee

The recent large-scale text-to-speech (TTS) systems are usually grouped as autoregressive and non-autoregressive systems. The autoregressive systems implicitly model duration but exhibit certain deficiencies in robustness and lack of…

Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the…

Machine Learning · Computer Science 2015-09-24 Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer

Agents that can follow language instructions are expected to be useful in a variety of situations such as navigation. However, training neural network-based agents requires numerous paired trajectories and languages. This paper proposes…

Machine Learning · Computer Science 2023-01-03 Kei Akuzawa, Yusuke Iwasawa, Yutaka Matsuo

Neural text-to-speech (TTS) models can synthesize natural human speech when trained on large amounts of transcribed speech. However, collecting such large-scale transcribed data is expensive. This paper proposes an unsupervised pre-training…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-29 Seongyeon Park, Myungseo Song, Bohyung Kim, Tae-Hyun Oh

Self-supervised pre-training has been successful in both text and speech processing. Speech and text offer different but complementary information. The question is whether we are able to perform a speech-text joint pre-training on unpaired…

Computation and Language · Computer Science 2022-11-01 Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

Unsupervised clustering on speakers is becoming increasingly important for its potential uses in semi-supervised learning. In reality, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions.…

Audio and Speech Processing · Electrical Eng. & Systems 2022-04-26 Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li

Recent advances in neural network-based text-to-speech have reached human level naturalness in synthetic speech. The present sequence-to-sequence models can directly map text to mel-spectrogram acoustic features, which are convenient for…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-27 Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku

Recurrent Neural Networks (RNNs) have become the standard modeling technique for sequence data, and are used in a number of novel text-to-speech models. However, training a TTS model including RNN components has certain requirements for GPU…

Computation and Language · Computer Science 2023-04-18 Ziqi Liang

Self-supervised learning has emerged as a powerful approach for leveraging large-scale unlabeled data to improve model performance in various domains. In this paper, we explore masked self-supervised pre-training for text recognition…

Computer Vision and Pattern Recognition · Computer Science 2025-03-31 Martin Kišš, Michal Hradiš

Copying mechanism shows effectiveness in sequence-to-sequence based neural network models for text generation tasks, such as abstractive sentence summarization and question generation. However, existing works on modeling copying or pointing…

Computation and Language · Computer Science 2018-07-09 Qingyu Zhou, Nan Yang, Furu Wei, Ming Zhou

We introduce a novel sequence-to-sequence (seq2seq) voice conversion (VC) model based on the Transformer architecture with text-to-speech (TTS) pretraining. Seq2seq VC models are attractive owing to their ability to convert prosody. While…

Audio and Speech Processing · Electrical Eng. & Systems 2019-12-17 Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda