BERTGEN: Multi-task Generation through BERT

Faidon Mitzalis; Ozan Caglayan; Pranava Madhyastha; Lucia Specia

Computation and Language 2021-06-08 v1

Abstract

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing the multimodal and multilingual pre-trained models VL-BERT and M-BERT, respectively. BERTGEN is trained auto-regressively for language generation tasks, namely image captioning, machine translation and multimodal machine translation, under a multi-task setting. With a comprehensive set of evaluations, we show that BERTGEN outperforms many strong baselines across the tasks explored. We also show BERTGEN's ability for zero-shot language generation, where its performance is competitive with that of supervised counterparts. Finally, we conduct ablation studies which demonstrate that BERTGEN substantially benefits from multi-tasking and effectively transfers relevant inductive biases from the pre-trained models.

Keywords

BERT; pre-trained language model; text generation

Cite

@article{arxiv.2106.03484,
  title  = {BERTGEN: Multi-task Generation through BERT},
  author = {Faidon Mitzalis and Ozan Caglayan and Pranava Madhyastha and Lucia Specia},
  journal= {arXiv preprint arXiv:2106.03484},
  year   = {2021}
}

Comments

Accepted to ACL 2021 Main Conference
