Related papers: BERTGEN: Multi-task Generation through BERT

200 papers

Pre-trained language models have recently contributed to significant advances in NLP tasks. Recently, multi-modal versions of BERT have been developed, using heavy pre-training relying on vast corpora of aligned textual and image data,…

Computation and Language · Computer Science 2020-12-17 Thomas Scialom, Patrick Bordes, Paul-Alexis Dray, Jacopo Staiano, Patrick Gallinari

The multilingual BERT model is trained on 104 languages and meant to serve as a universal language model and tool for encoding sentences. We explore how well the model performs on several languages across several tasks: a diagnostic…

Computation and Language · Computer Science 2019-10-10 Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter

Multilingual pretrained language models have demonstrated remarkable zero-shot cross-lingual transfer capabilities. Such transfer emerges by fine-tuning on a task of interest in one language and evaluating on a distinct language, not seen…

Computation and Language · Computer Science 2021-01-28 Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional…

Computation and Language · Computer Science 2019-05-28 Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks. Using the same architecture and parameters, we developed and evaluated a monolingual…

Computation and Language · Computer Science 2019-12-23 Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim

Recently, the bidirectional encoder representations from transformers (BERT) model has attracted much attention in the field of natural language processing, owing to its high performance in language understanding-related tasks. The BERT…

Machine Learning · Computer Science 2020-04-16 Kazuki Miyazawa, Tatsuya Aoki, Takato Horii, Takayuki Nagai

Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2018) have pushed forward the state-of-the-art on many NLP tasks. A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104…

Computation and Language · Computer Science 2019-10-04 Shijie Wu, Mark Dredze

Multilingual BERT (mBERT), a language model pre-trained on large multilingual corpora, has impressive zero-shot cross-lingual transfer capabilities and performs surprisingly well on zero-shot POS tagging and Named Entity Recognition (NER),…

Computation and Language · Computer Science 2022-05-18 Beiduo Chen, Wu Guo, Quan Liu, Kun Tao

We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language. We extend the popular BERT architecture to a multi-modal two-stream model, processing…

Computer Vision and Pattern Recognition · Computer Science 2019-08-07 Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee

Large-scale pre-trained language models such as BERT have achieved great success in language understanding tasks. However, it remains an open question how to utilize BERT for language generation. In this paper, we present a novel approach,…

Computation and Language · Computer Science 2020-07-21 Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu

Deep learning-based language models pretrained on large unannotated text corpora have been demonstrated to allow efficient transfer learning for natural language processing, with recent approaches such as the transformer-based BERT model…

Computation and Language · Computer Science 2019-12-17 Antti Virtanen, Jenna Kanerva, Rami Ilo, Jouni Luoma, Juhani Luotolahti, Tapio Salakoski, Filip Ginter, Sampo Pyysalo

In this study, we implement a novel BERT architecture for multitask fine-tuning on three downstream tasks: sentiment classification, paraphrase detection, and semantic textual similarity prediction. Our model, Multitask BERT, incorporates…

Computation and Language · Computer Science 2024-08-29 Christopher Sun, Abishek Satish

In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in…

Computation and Language · Computer Science 2019-06-05 Telmo Pires, Eva Schlinger, Dan Garrette

Encoder-only language models are frequently used for a variety of standard machine learning tasks, including classification and retrieval. However, there has been a lack of recent research for encoder models, especially with respect to…

Computation and Language · Computer Science 2025-09-09 Marc Marone, Orion Weller, William Fleshman, Eugene Yang, Dawn Lawrie, Benjamin Van Durme

Multi-task learning (MTL) has achieved remarkable success in natural language processing applications. In this work, we study a multi-task learning model with multiple decoders on varieties of biomedical and clinical natural language…

Computation and Language · Computer Science 2020-05-07 Yifan Peng, Qingyu Chen, Zhiyong Lu

Recently, Natural Language Processing (NLP) has witnessed impressive progress in many areas, due to the advent of novel, pretrained contextual representation models. In particular, Devlin et al. (2019) proposed a model, called BERT…

Computation and Language · Computer Science 2020-03-09 Debora Nozza, Federico Bianchi, Dirk Hovy

For a computer to naturally interact with a human, it needs to be human-like. In this paper, we propose a neural response generation model with multi-task learning of generation and classification, focusing on emotion. Our model based on…

Computation and Language · Computer Science 2021-05-26 Tatsuya Ide, Daisuke Kawahara

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models. Diffusion models and many pre-trained language models have a shared training objective, i.e., denoising, making it possible to combine the…

Computation and Language · Computer Science 2022-12-02 Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu

Multi-modal pretraining for learning high-level multi-modal representation is a further step towards deep learning and artificial intelligence. In this work, we propose a novel model, namely InterBERT (BERT for Interaction), which is the…

Computation and Language · Computer Science 2021-04-23 Junyang Lin, An Yang, Yichang Zhang, Jie Liu, Jingren Zhou, Hongxia Yang

Large pre-trained language models help to achieve state of the art on a variety of natural language processing (NLP) tasks, nevertheless, they still suffer from forgetting when incrementally learning a sequence of tasks. To alleviate this…

Computation and Language · Computer Science 2023-03-03 Mingxu Tao, Yansong Feng, Dongyan Zhao