Related papers: data2lang2vec: Data Driven Typological Features Co…

200 papers

Linguistic typology aims to capture structural and semantic variation across the world's languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that…

Computation and Language · Computer Science 2020-10-28 Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen

Cross-lingual transfer learning is an invaluable tool for overcoming data scarcity, yet selecting a suitable transfer language remains a challenge. The precise roles of linguistic typology, training data, and model architecture in transfer…

Computation and Language · Computer Science 2025-03-27 Enora Rice, Ali Marashian, Hannah Haynie, Katharina von der Wense, Alexis Palmer

The use of linguistic typological resources in natural language processing has been steadily gaining more popularity. It has been observed that the use of typological information, often combined with distributed language representations,…

Computation and Language · Computer Science 2020-05-06 Alexander Gutkin, Tatiana Merkulova, Martin Jansche

Despite major advances in multilingual modeling, large quality disparities persist across languages. Besides the obvious impact of uneven training resources, typological properties have also been proposed to determine the intrinsic…

Computation and Language · Computer Science 2026-02-04 Vitalii Hirak, Jaap Jumelet, Arianna Bisazza

Syntactic annotation of corpora in the form of part-of-speech (POS) tags is a key requirement for both linguistic research and subsequent automated natural language processing (NLP) tasks. This problem is commonly tackled using machine…

Computation and Language · Computer Science 2024-10-30 Stefan Heid, Marcel Wever, Eyke Hüllermeier

One central mystery of neural NLP is what neural models "know" about their subject matter. When a neural machine translation system learns to translate from one language to another, does it learn the syntax or semantics of the languages?…

Computation and Language · Computer Science 2017-08-01 Chaitanya Malaviya, Graham Neubig, Patrick Littell

While information from the field of linguistic typology has the potential to improve performance on NLP tasks, reliable typological data is a prerequisite. Existing typological databases, including WALS and Grambank, suffer from…

Computation and Language · Computer Science 2024-02-05 Emi Baylor, Esther Ploeger, Johannes Bjerva

Part-of-speech (POS) tagging is considered as one of the basic but necessary tools which are required for many Natural Language Processing (NLP) applications such as word sense disambiguation, information retrieval, information processing,…

Computation and Language · Computer Science 2020-01-13 Ibrahim Gashaw, H L. Shashirekha

Linguistic resources such as part-of-speech (POS) tags have been extensively used in statistical machine translation (SMT) frameworks and have yielded better performances. However, usage of such linguistic annotations in neural machine…

Computation and Language · Computer Science 2017-08-04 Jan Niehues, Eunah Cho

While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind. To get us closer to general self-supervised…

Machine Learning · Computer Science 2022-10-27 Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli

Typologically diverse benchmarks are increasingly created to track the progress achieved in multilingual NLP. Linguistic diversity of these data sets is typically measured as the number of languages or language families included in the…

Computation and Language · Computer Science 2024-04-17 Tanja Samardzic, Ximena Gutierrez, Christian Bentz, Steven Moran, Olga Pelloni

Morphological analysis involves predicting the syntactic traits of a word (e.g. {POS: Noun, Case: Acc, Gender: Fem}). Previous work in morphological tagging improves performance for low-resource languages (LRLs) through cross-lingual…

Computation and Language · Computer Science 2018-07-12 Chaitanya Malaviya, Matthew R. Gormley, Graham Neubig

Morphosyntactic lexicons and word vector representations have both proven useful for improving the accuracy of statistical part-of-speech taggers. Here we compare the performances of four systems on datasets covering 16 languages, two of…

Computation and Language · Computer Science 2016-08-10 Benoît Sagot

There are two main methodologies for constructing the knowledge base of a natural language analyser: the linguistic and the data-driven. Recent state-of-the-art part-of-speech taggers are based on the data-driven approach. Because of the…

cmp-lg · Computer Science 2016-08-31 Atro Voutilainen

Part-of-speech (POS) tagging plays an important role in Natural Language Processing (NLP). Its applications can be found in many NLP tasks such as named entity recognition, syntactic parsing, dependency parsing and text chunking. In the…

Computation and Language · Computer Science 2022-06-15 Tuan-Phong Nguyen, Quoc-Tuan Truong, Xuan-Nam Nguyen, Anh-Cuong Le

We show how to predict the basic word-order facts of a novel language given only a corpus of part-of-speech (POS) sequences. We predict how often direct objects follow their verbs, how often adjectives follow their nouns, and in general the…

Computation and Language · Computer Science 2017-10-12 Dingquan Wang, Jason Eisner

The performance of multilingual pretrained models is highly dependent on the availability of monolingual or parallel text present in a target language. Thus, the majority of the world's languages cannot benefit from recent progress in NLP…

Computation and Language · Computer Science 2022-04-07 Xinyi Wang, Sebastian Ruder, Graham Neubig

Developing an automatic part-of-speech (POS) tagging for any new language is considered a necessary step for further computational linguistics methodology beyond tagging, like chunking and parsing, to be fully applied to the language. Many…

Computation and Language · Computer Science 2021-10-12 Onyenwe Ikechukwu, Onyedikachukwu Ikechukwu-Onyenwe, Onyedinma Ebele

In the pursuit of supporting more languages around the world, tools that characterize properties of languages play a key role in expanding the existing multilingual NLP research. In this study, we focus on a widely used typological…

Computation and Language · Computer Science 2024-05-21 Hasti Toossi, Guo Qing Huai, Jinyu Liu, Eric Khiu, A. Seza Doğruöz, En-Shiun Annie Lee

Lexical resources are crucial for cross-linguistic analysis and can provide new insights into computational models for natural language learning. Here, we present an advanced database for comparative studies of words with multiple meanings,…

Computation and Language · Computer Science 2025-08-22 Annika Tjuka, Robert Forkel, Christoph Rzymski, Johann-Mattis List