Related papers: LIMIT-BERT: Linguistic Informed Multi-Task BERT

Syntax-augmented Multilingual BERT for Cross-lingual Transfer

In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning. However, due to typological differences across languages,…

Computation and Language · Computer Science 2021-06-07 Wasi Uddin Ahmad, Haoran Li, Kai-Wei Chang, Yashar Mehdad

How Language-Neutral is Multilingual BERT?

Multilingual BERT (mBERT) provides sentence representations for 104 languages, which are useful for many multi-lingual tasks. Previous work probed the cross-linguality of mBERT using zero-shot transfer learning on morphological and…

Computation and Language · Computer Science 2019-11-11 Jindřich Libovický, Rudolf Rosa, Alexander Fraser

A Syntax-aware Multi-task Learning Framework for Chinese Semantic Role Labeling

Semantic role labeling (SRL) aims to identify the predicate-argument structure of a sentence. Inspired by the strong correlation between syntax and semantics, previous works pay much attention to improve SRL performance on exploiting…

Computation and Language · Computer Science 2019-11-13 Qingrong Xia, Zhenghua Li, Min Zhang

Incorporating Lexical and Syntactic Knowledge for Unsupervised Cross-Lingual Transfer

Unsupervised cross-lingual transfer involves transferring knowledge between languages without explicit supervision. Although numerous studies have been conducted to improve performance in such tasks by focusing on cross-lingual knowledge,…

Computation and Language · Computer Science 2024-04-26 Jianyu Zheng, Fengfei Fan, Jianquan Li

Hierarchical Multitask Learning Approach for BERT

Recent works show that learning contextualized embeddings for words is beneficial for downstream tasks. BERT is one successful example of this approach. It learns embeddings by solving two tasks, which are masked language model (masked LM)…

Computation and Language · Computer Science 2020-11-10 Çağla Aksoy, Alper Ahmetoğlu, Tunga Güngör

Multi-task Pre-training Language Model for Semantic Network Completion

Semantic networks, such as the knowledge graph, can represent the knowledge leveraging the graph structure. Although the knowledge graph shows promising values in natural language processing, it suffers from incompleteness. This paper…

Computation and Language · Computer Science 2022-04-29 Da Li, Sen Yang, Kele Xu, Ming Yi, Yukai He, Huaimin Wang

BURT: BERT-inspired Universal Representation from Learning Meaningful Segment

Although pre-trained contextualized language models such as BERT achieve significant performance on various downstream tasks, current language representation still only focuses on linguistic objective at a specific granularity, which may…

Computation and Language · Computer Science 2021-01-01 Yian Li, Hai Zhao

An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding

This paper proposes a new principled multi-task representation learning framework (InfoMTL) to extract noise-invariant sufficient representations for all tasks. It ensures sufficiency of shared representations for all tasks and mitigates…

Computation and Language · Computer Science 2025-03-07 Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu

On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning

Recent work has shown evidence that the knowledge acquired by multilingual BERT (mBERT) has two components: a language-specific and a language-neutral one. This paper analyses the relationship between them, in the context of fine-tuning on…

Computation and Language · Computer Science 2021-12-28 Marc Tanti, Lonneke van der Plas, Claudia Borg, Albert Gatt

LERT: A Linguistically-motivated Pre-trained Language Model

Pre-trained Language Model (PLM) has become a representative foundation model in the natural language processing field. Most PLMs are trained with linguistic-agnostic pre-training tasks on the surface form of the text, such as the masked…

Computation and Language · Computer Science 2022-11-11 Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu

MetricBERT: Text Representation Learning via Self-Supervised Triplet Training

We present MetricBERT, a BERT-based model that learns to embed text under a well-defined similarity metric while simultaneously adhering to the "traditional" masked-language task. We focus on downstream tasks of learning similarities for…

Computation and Language · Computer Science 2022-08-16 Itzik Malkiel, Dvir Ginzburg, Oren Barkan, Avi Caciularu, Yoni Weill, Noam Koenigstein

Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models

Chinese pre-trained language models usually process text as a sequence of characters, while ignoring more coarse granularity, e.g., words. In this work, we propose a novel pre-training paradigm for Chinese -- Lattice-BERT, which explicitly…

Computation and Language · Computer Science 2021-05-31 Yuxuan Lai, Yijia Liu, Yansong Feng, Songfang Huang, Dongyan Zhao

MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning

Transformer language models (TLMs) are critical for most NLP tasks, but they are difficult to create for low-resource languages because of how much pretraining data they require. In this work, we investigate two techniques for training…

Computation and Language · Computer Science 2023-01-06 Luke Gessler, Amir Zeldes

MTLB-STRUCT @PARSEME 2020: Capturing Unseen Multiword Expressions Using Multi-task Learning and Pre-trained Masked Language Models

This paper describes a semi-supervised system that jointly learns verbal multiword expressions (VMWEs) and dependency parse trees as an auxiliary task. The model benefits from pre-trained multilingual BERT. BERT hidden layers are shared…

Computation and Language · Computer Science 2020-11-06 Shiva Taslimipoor, Sara Bahaadini, Ekaterina Kochmar

Semantics-aware BERT for Language Understanding

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of success especially in various machine reading comprehension and natural language inference…

Computation and Language · Computer Science 2020-02-05 Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou

Finding Universal Grammatical Relations in Multilingual BERT

Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared…

Computation and Language · Computer Science 2020-05-21 Ethan A. Chi, John Hewitt, Christopher D. Manning

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees

Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information. Meanwhile, syntactic information has been proved to be crucial for the success of NLP…

Computation and Language · Computer Science 2021-03-09 Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Unsupervised pretraining models have been shown to facilitate a wide range of downstream NLP applications. These models, however, retain some of the limitations of traditional static word embeddings. In particular, they encode only the…

Computation and Language · Computer Science 2020-04-21 Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

Improving Contextual Representation with Gloss Regularized Pre-training

Though achieving impressive results on many NLP tasks, the BERT-like masked language models (MLM) encounter the discrepancy between pre-training and inference. In light of this gap, we investigate the contextual representation of…

Computation and Language · Computer Science 2022-05-16 Yu Lin, Zhecheng An, Peihao Wu, Zejun Ma

Distilling Knowledge Learned in BERT for Text Generation

Large-scale pre-trained language model such as BERT has achieved great success in language understanding tasks. However, it remains an open question how to utilize BERT for language generation. In this paper, we present a novel approach,…

Computation and Language · Computer Science 2020-07-21 Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu