English
Related papers

Related papers: Structural Self-Supervised Objectives for Transfor…

200 papers

Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations. MLM trains a model to predict a random sample of input tokens that have been replaced…

Computation and Language · Computer Science 2021-09-07 Atsuki Yamaguchi , George Chrysostomou , Katerina Margatina , Nikolaos Aletras

Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT), by drastically reducing the need for large parallel data. Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence…

Computation and Language · Computer Science 2021-06-11 Christos Baziotis , Ivan Titov , Alexandra Birch , Barry Haddow

Transformers have gained popularity in the software engineering (SE) literature. These deep learning models are usually pre-trained through a self-supervised objective, meant to provide the model with basic knowledge about a language of…

Software Engineering · Computer Science 2023-02-09 Rosalia Tufano , Luca Pascarella , Gabriele Bavota

Large Transformer-based language models are pre-trained on corpora of varying sizes, for a different number of steps and with different batch sizes. At the same time, more fundamental components, such as the pre-training objective or…

Computation and Language · Computer Science 2021-05-12 M. Aßenmacher , P. Schulze , C. Heumann

Masked language modeling (MLM) pre-training methods such as BERT corrupt the input by replacing some tokens with [MASK] and then train a model to reconstruct the original tokens. While they produce good results when transferred to…

Computation and Language · Computer Science 2020-03-25 Kevin Clark , Minh-Thang Luong , Quoc V. Le , Christopher D. Manning

The recent rapid progress in pre-training Large Language Models has relied on using self-supervised language modeling objectives like next token prediction or span corruption. On the other hand, Machine Translation Systems are mostly…

Computation and Language · Computer Science 2023-05-22 Andrea Schioppa , Xavier Garcia , Orhan Firat

Unsupervised pre-training is now the predominant approach for both text and speech understanding. Self-attention models pre-trained on large amounts of unannotated data have been hugely successful when fine-tuned on downstream tasks from a…

Computation and Language · Computer Science 2021-10-22 Ankur Bapna , Yu-an Chung , Nan Wu , Anmol Gulati , Ye Jia , Jonathan H. Clark , Melvin Johnson , Jason Riesa , Alexis Conneau , Yu Zhang

Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g. BERT) with the aim of learning better language representations. However, to the best of our knowledge, no…

Computation and Language · Computer Science 2022-03-22 Ahmed Alajrami , Nikolaos Aletras

BERT set many state-of-the-art results over varied NLU benchmarks by pre-training over two tasks: masked language modelling (MLM) and next sentence prediction (NSP), the latter of which has been highly criticized. In this paper, we 1)…

Computation and Language · Computer Science 2020-10-06 Stephane Aroca-Ouellette , Frank Rudzicz

Transformer has been widely used for self-supervised pre-training in Natural Language Processing (NLP) and achieved great success. However, it has not been fully explored in visual self-supervised learning. Meanwhile, previous methods only…

Computer Vision and Pattern Recognition · Computer Science 2021-10-26 Zhaowen Li , Zhiyang Chen , Fan Yang , Wei Li , Yousong Zhu , Chaoyang Zhao , Rui Deng , Liwei Wu , Rui Zhao , Ming Tang , Jinqiao Wang

Multiple pre-training objectives fill the vacancy of the understanding capability of single-objective language modeling, which serves the ultimate purpose of pre-trained language models (PrLMs), generalizing well on a mass of scenarios.…

Computation and Language · Computer Science 2022-10-20 Hongqiu Wu , Ruixue Ding , Hai Zhao , Boli Chen , Pengjun Xie , Fei Huang , Min Zhang

Pretrained language models (PLMs) display impressive performances and have captured the attention of the NLP community. Establishing best practices in pretraining has, therefore, become a major focus of NLP research, especially since…

Computation and Language · Computer Science 2024-10-08 Zihao Li , Shaoxiong Ji , Timothee Mickus , Vincent Segonne , Jörg Tiedemann

We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner. Recent pre-training methods in NLP focus on learning either bottom or top-level…

Computation and Language · Computer Science 2020-11-02 Haejun Lee , Drew A. Hudson , Kangwook Lee , Christopher D. Manning

Self-supervised pre-training of transformer models has revolutionized NLP applications. Such pre-training with language modeling objectives provides a useful initial point for parameters that generalize well to new tasks with fine-tuning.…

Computation and Language · Computer Science 2020-11-17 Trapit Bansal , Rishikesh Jha , Tsendsuren Munkhdalai , Andrew McCallum

Masked language modeling (MLM) is one of the key sub-tasks in vision-language pretraining. In the cross-modal setting, tokens in the sentence are masked at random, and the model predicts the masked tokens given the image and the text. In…

Computation and Language · Computer Science 2021-09-07 Yonatan Bitton , Gabriel Stanovsky , Michael Elhadad , Roy Schwartz

Pre-training a language model and then fine-tuning it for downstream tasks has demonstrated state-of-the-art results for various NLP tasks. Pre-training is usually independent of the downstream task, and previous works have shown that this…

Computation and Language · Computer Science 2022-11-28 Tanish Lad , Himanshu Maheshwari , Shreyas Kottukkal , Radhika Mamidi

Pre-trained Language Models (PLMs) have been widely used in various natural language processing (NLP) tasks, owing to their powerful text representations trained on large-scale corpora. In this paper, we propose a new PLM called PERT for…

Computation and Language · Computer Science 2022-03-15 Yiming Cui , Ziqing Yang , Ting Liu

Transformers have gained increasing popularity in a wide range of applications, including Natural Language Processing (NLP), Computer Vision and Speech Recognition, because of their powerful representational capacity. However, harnessing…

Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task. The evolution of these models started with GPT and BERT. These models are built on the top of transformers, self-supervised…

Computation and Language · Computer Science 2021-08-31 Katikapalli Subramanyam Kalyan , Ajit Rajasekharan , Sivanesan Sangeetha

Pre-trained models are widely used in the tasks of natural language processing nowadays. However, in the specific field of text simplification, the research on improving pre-trained models is still blank. In this work, we propose a…

Computation and Language · Computer Science 2022-04-19 Renliang Sun , Xiaojun Wan
‹ Prev 1 2 3 10 Next ›