Related papers: Does Chinese BERT Encode Word Structure?

200 papers

Chinese word segmentation (CWS) is a fundamental task for Chinese language understanding. Recently, neural network-based models have attained superior performance in solving the in-domain CWS task. Last year, Bidirectional Encoder…

Computation and Language · Computer Science 2019-09-23 Haiqin Yang

How and to what extent does BERT encode syntactically-sensitive hierarchical information or positionally-sensitive linear information? Recent work has shown that contextual representations like BERT perform well on tasks that require…

Computation and Language · Computer Science 2019-06-06 Yongjie Lin, Yi Chern Tan, Robert Frank

Most of the Chinese pre-trained models adopt characters as basic units for downstream tasks. However, these models ignore the information carried by words and thus lead to the loss of some important semantics. In this paper, we propose a…

Computation and Language · Computer Science 2022-07-14 Wenbiao Li, Rui Sun, Yunfang Wu

Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al., 2019), has been one of the base models for various NLP tasks due to its remarkable performance. Variants customized for different languages and tasks are…

Computation and Language · Computer Science 2022-11-22 Ting Han, Kunhao Pan, Xinyu Chen, Dingjie Song, Yuchen Fan, Xinyu Gao, Ruyi Gan, Jiaxing Zhang

The pre-training of text encoders normally processes text as a sequence of tokens corresponding to small text units, such as word pieces in English and characters in Chinese. It omits information carried by larger text granularity, and thus…

Computation and Language · Computer Science 2019-11-05 Shizhe Diao, Jiaxin Bai, Yan Song, Tong Zhang, Yonggang Wang

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic…

Machine Learning · Computer Science 2019-10-29 Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, Martin Wattenberg

BERT-based models have shown a remarkable ability in the Chinese Spelling Check (CSC) task recently. However, traditional BERT-based methods still suffer from two limitations. First, although previous works have identified that explicit…

Computation and Language · Computer Science 2023-12-29 Yongchang Cao, Liang He, Zhen Wu, Xinyu Dai

Recent advances in large-scale language representation models such as BERT have improved the state-of-the-art performances in many NLP tasks. Meanwhile, character-level Chinese NLP models, including BERT for Chinese, have also demonstrated…

Computation and Language · Computer Science 2020-04-09 Boxin Wang, Boyuan Pan, Xin Li, Bo Li

In this work, we present Lex-BERT, which incorporates the lexicon information into Chinese BERT for named entity recognition (NER) tasks in a natural manner. Instead of using word embeddings and a newly designed transformer layer as in…

Computation and Language · Computer Science 2021-04-19 Wei Zhu, Daniel Cheung

In this paper we investigate the linguistic knowledge learned by a Neural Language Model (NLM) before and after a fine-tuning process and how this knowledge affects its predictions during several classification problems. We use a wide set…

Computation and Language · Computer Science 2024-02-27 Alessio Miaschi, Dominique Brunato, Felice Dell'Orletta, Giulia Venturi

Lexicon information and pre-trained models, such as BERT, have been combined to explore Chinese sequence labelling tasks due to their respective strengths. However, existing methods solely fuse lexicon features via a shallow and random…

Computation and Language · Computer Science 2021-12-28 Wei Liu, Xiyan Fu, Yue Zhang, Wenming Xiao

With the rapid development of information technology, online platforms (e.g., news portals and social media) generate enormous web information every moment. Therefore, it is crucial to extract structured representations of events from…

Computation and Language · Computer Science 2021-12-21 Jiangwei Liu, Jingshu Zhang, Xiaohong Huang, Liangyu Min

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of successes, especially in various machine reading comprehension and natural language inference…

Computation and Language · Computer Science 2020-02-05 Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou

Pretrained language models (PLMs) have shown marvelous improvements across various NLP tasks. Most Chinese PLMs simply treat an input text as a sequence of characters, and completely ignore word information. Although Whole Word Masking can…

Computation and Language · Computer Science 2023-03-23 Xinnian Liang, Zefan Zhou, Hui Huang, Shuangzhi Wu, Tong Xiao, Muyun Yang, Zhoujun Li, Chao Bian

Chinese pre-trained language models usually process text as a sequence of characters, while ignoring more coarse granularity, e.g., words. In this work, we propose a novel pre-training paradigm for Chinese -- Lattice-BERT, which explicitly…

Computation and Language · Computer Science 2021-05-31 Yuxuan Lai, Yijia Liu, Yansong Feng, Songfang Huang, Dongyan Zhao

Contextual word embeddings such as BERT have achieved state of the art performance in numerous NLP tasks. Since they are optimized to capture the statistical properties of training data, they tend to pick up on and amplify social…

Computation and Language · Computer Science 2019-06-19 Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, Yulia Tsvetkov

Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network. We find that the model represents the…

Computation and Language · Computer Science 2019-08-12 Ian Tenney, Dipanjan Das, Ellie Pavlick

The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. Among…

Computation and Language · Computer Science 2021-02-11 Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

We propose a new Named entity recognition (NER) method to effectively make use of the results of Part-of-speech (POS) tagging, Chinese word segmentation (CWS) and parsing while avoiding NER error caused by POS tagging error. This paper…

Computation and Language · Computer Science 2021-01-28 Xiao Fu, Guijun Zhang

Recent pretraining models in Chinese neglect two important aspects specific to the Chinese language: glyph and pinyin, which carry significant syntax and semantic information for language understanding. In this work, we propose ChineseBERT,…

Computation and Language · Computer Science 2021-07-01 Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu, Jiwei Li