Related papers: Many Languages, One Parser

Exploiting Multi-typed Treebanks for Parsing with Deep Multi-task Learning

Various treebanks have been released for dependency parsing. Although treebanks may belong to different languages or follow different annotation schemes, they contain syntactic knowledge that has the potential to benefit each other. This paper…

Computation and Language · Computer Science 2016-06-06 Jiang Guo, Wanxiang Che, Haifeng Wang, Ting Liu

Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank

Pretrained multilingual contextual representations have shown great success, but due to the limits of their pretraining data, their benefits do not apply equally to all language varieties. This presents a challenge for language varieties…

Computation and Language · Computer Science 2022-06-22 Ethan C. Chau, Lucy H. Lin, Noah A. Smith

Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing

We present a dependency parser implemented as a single deep neural network that reads orthographic representations of words and directly generates dependencies and their labels. Unlike typical approaches to parsing, the model doesn't…

Computation and Language · Computer Science 2017-06-07 Jan Chorowski, Michał Zapotoczny, Paweł Rychlikowski

Parser Training with Heterogeneous Treebanks

How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks…

Computation and Language · Computer Science 2018-05-15 Sara Stymne, Miryam de Lhoneux, Aaron Smith, Joakim Nivre

One model, two languages: training bilingual parsers with harmonized treebanks

We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even…

Computation and Language · Computer Science 2016-05-20 David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso

Cross-lingual Parsing with Polyglot Training and Multi-treebank Learning: A Faroese Case Study

Cross-lingual dependency parsing involves transferring syntactic knowledge from one language to another. It is a crucial component for inducing dependency parsers in low-resource scenarios where no training data for a language exists. Using…

Computation and Language · Computer Science 2019-10-18 James Barry, Joachim Wagner, Jennifer Foster

Cross-lingual Universal Dependency Parsing Only from One Monolingual Treebank

Syntactic parsing is a highly linguistic processing task whose parsers require training on treebanks produced by expensive human annotation. As it is unlikely that a treebank can be obtained for every human language, in this work, we propose an effective…

Computation and Language · Computer Science 2021-04-26 Kailai Sun, Zuchao Li, Hai Zhao

On Multilingual Training of Neural Dependency Parsers

We show that a recently proposed neural dependency parser can be improved by joint training on multiple languages from the same family. The parser is implemented as a deep neural network whose only input is orthographic representations of…

Computation and Language · Computer Science 2017-05-30 Michał Zapotoczny, Paweł Rychlikowski, Jan Chorowski

Parsing with Pretrained Language Models, Multiple Datasets, and Dataset Embeddings

With an increase of dataset availability, the potential for learning from a variety of data sources has increased. One particular method to improve learning from multiple data sources is to embed the data source during training. This allows…

Computation and Language · Computer Science 2021-12-08 Rob van der Goot, Miryam de Lhoneux

Dependency Language Models for Transition-based Dependency Parsing

In this paper, we present an approach to improve the accuracy of a strong transition-based dependency parser by exploiting dependency language models that are extracted from a large parsed corpus. We integrated a small number of features…

Computation and Language · Computer Science 2017-09-01 Juntao Yu, Bernd Bohnet

Multi-Sense Language Modelling

The effectiveness of a language model is influenced by its token representations, which must encode contextual information and handle the same word form having a plurality of meanings (polysemy). Currently, none of the common language…

Computation and Language · Computer Science 2022-06-02 Andrea Lekkas, Peter Schneider-Kamp, Isabelle Augenstein

An Empirical Study of Factors Affecting Language-Independent Models

Scaling existing applications and solutions to multiple human languages has traditionally proven to be difficult, mainly due to the language-dependent nature of preprocessing and feature engineering techniques employed in traditional…

Computation and Language · Computer Science 2020-01-01 Xiaotong Liu, Yingbei Tong, Anbang Xu, Rama Akkiraju

Treebank Embedding Vectors for Out-of-domain Dependency Parsing

A recent advance in monolingual dependency parsing is the idea of a treebank embedding vector, which allows all treebanks for a particular language to be used as training data while at the same time allowing the model to prefer training…

Computation and Language · Computer Science 2020-05-05 Joachim Wagner, James Barry, Jennifer Foster

On Efficiently Acquiring Annotations for Multilingual Models

When tasked with supporting multiple languages for a given problem, two approaches have arisen: training a model for each language with the annotation budget divided equally among them, and training on a high-resource language followed by…

Computation and Language · Computer Science 2022-04-05 Joel Ruben Antony Moniz, Barun Patra, Matthew R. Gormley

UDapter: Language Adaptation for Truly Universal Dependency Parsing

Recent advances in multilingual dependency parsing have brought the idea of a truly universal parser closer to reality. However, cross-language interference and restrained model capacity remain major obstacles. To address this, we propose a…

Computation and Language · Computer Science 2020-10-07 Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord

Multitask Pointer Network for Multi-Representational Parsing

We propose a transition-based approach that, by training a single model, can efficiently parse any input sentence with both constituent and dependency trees, supporting both continuous/projective and discontinuous/non-projective syntactic…

Computation and Language · Computer Science 2022-12-26 Daniel Fernández-González, Carlos Gómez-Rodríguez

Lessons learned in multilingual grounded language learning

Recent work has shown how to learn better visual-semantic embeddings by leveraging image descriptions in more than one language. Here, we investigate in detail which conditions affect the performance of this type of grounded language…

Computation and Language · Computer Science 2018-09-21 Ákos Kádár, Desmond Elliott, Marc-Alexandre Côté, Grzegorz Chrupała, Afra Alishahi

Are Multilingual Models Effective in Code-Switching?

Multilingual language models have shown decent performance in multilingual and cross-lingual natural language understanding tasks. However, the power of these multilingual models in code-switching tasks has not been fully explored. In this…

Computation and Language · Computer Science 2021-03-25 Genta Indra Winata, Samuel Cahyawijaya, Zihan Liu, Zhaojiang Lin, Andrea Madotto, Pascale Fung

Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages

Multilingual language models have recently gained attention as a promising solution for representing multiple languages in a single model. In this paper, we propose new criteria to evaluate the quality of lexical representation and…

Computation and Language · Computer Science 2023-05-30 Tomasz Limisiewicz, Jiří Balhar, David Mareček

One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech

We introduce an approach to multilingual speech synthesis which uses the meta-learning concept of contextual parameter generation and produces natural-sounding multilingual speech using more languages and less training data than previous…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-04 Tomáš Nekvinda, Ondřej Dušek