Related papers: Continual Learning Under Language Shift

200 papers

The capacity and effectiveness of pre-trained multilingual models (MLMs) for zero-shot cross-lingual transfer is well established. However, phenomena of positive or negative transfer, and the effect of language choice still need to be fully…

Computation and Language · Computer Science 2024-04-01 Fahim Faisal, Antonios Anastasopoulos

Most Transformer language models are primarily pretrained on English text, limiting their use for other languages. As the model sizes grow, the performance gap between English and other languages with fewer compute and data resources…

Computation and Language · Computer Science 2023-01-24 Malte Ostendorff, Georg Rehm

Why do some languages like Czech permit free word order, while others like English do not? We address this question by pretraining transformer language models on a spectrum of synthetic word-order variants of natural languages. We observe…

Computation and Language · Computer Science 2026-03-23 Jonas Mayer Martins, Jaap Jumelet, Viola Priesemann, Lisa Beinborn

Transfer learning based on pretraining language models on a large amount of raw data has become a new norm to reach state-of-the-art performance in NLP. Still, it remains unclear how this approach should be applied for unseen languages that…

Computation and Language · Computer Science 2021-04-20 Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah

Recent studies in zero-shot cross-lingual learning using multilingual models have falsified the previous hypothesis that shared vocabulary and joint pre-training are the keys to cross-lingual generalization. Inspired by this advancement, we…

Computation and Language · Computer Science 2022-05-20 Evangelia Gogoulou, Ariel Ekgren, Tim Isbister, Magnus Sahlgren

English pretrained language models, which make up the backbone of many modern NLP systems, require huge amounts of unlabeled training data. These models are generally presented as being trained only on English text but have been found to…

Computation and Language · Computer Science 2022-11-18 Terra Blevins, Luke Zettlemoyer

Multilingual pretraining has been a successful solution to the challenges posed by the lack of resources for languages. These models can transfer knowledge to target languages with minimal or no examples. Recent research suggests that…

Computation and Language · Computer Science 2024-04-15 Leandro Rodrigues de Souza, Thales Sales Almeida, Roberto Lotufo, Rodrigo Nogueira

Pretrained multilingual models enable zero-shot learning even for unseen languages, and that performance can be further improved via adaptation prior to finetuning. However, it is unclear how the number of pretraining languages influences a…

Computation and Language · Computer Science 2022-03-22 Yoshinari Fujinuma, Jordan Boyd-Graber, Katharina Kann

Language is constantly changing and evolving, leaving language models to become quickly outdated. Consequently, we should continuously update our models with new data to expose them to new events and facts. However, that requires additional…

Computation and Language · Computer Science 2023-05-05 Giuseppe Attanasio, Debora Nozza, Federico Bianchi, Dirk Hovy

Crosslingual transfer is crucial to contemporary language models' multilingual capabilities, but how it occurs is not well understood. We ask what happens to a monolingual language model when it begins to be trained on a second language.…

Computation and Language · Computer Science 2025-06-05 Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen

Sentiment analysis (SA) systems are widely deployed in many of the world's languages, and there is well-documented evidence of demographic bias in these systems. In languages beyond English, scarcer training data is often supplemented with…

Computation and Language · Computer Science 2023-05-23 Seraphina Goldfarb-Tarrant, Björn Ross, Adam Lopez

Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this…

Computation and Language · Computer Science 2022-11-02 Rochelle Choenni, Dan Garrette, Ekaterina Shutova

Continual learning (CL) in large language models (LLMs) is an evolving domain that focuses on developing efficient and sustainable training strategies to adapt models to emerging knowledge and achieve robustness in dynamic environments. Our…

Computation and Language · Computer Science 2025-02-13 Çağatay Yıldız, Nishaanth Kanna Ravichandran, Nitin Sharma, Matthias Bethge, Beyza Ermis

As the application space of language models continues to evolve, a natural question to ask is how we can quickly adapt models to new tasks. We approach this classic question from a continual learning perspective, in which we aim to continue…

Computation and Language · Computer Science 2023-07-13 Adam Fisch, Amal Rannen-Triki, Razvan Pascanu, Jörg Bornschein, Angeliki Lazaridou, Elena Gribovskaya, Marc'Aurelio Ranzato

In order for large language models to be useful across the globe, they are fine-tuned to follow instructions on multilingual data. Despite the ubiquity of such post-training, a clear understanding of the dynamics that enable cross-lingual…

Computation and Language · Computer Science 2025-04-24 Luisa Shimabucoro, Ahmet Ustun, Marzieh Fadaee, Sebastian Ruder

Multilingual large language models are designed, claimed, and expected to cater to speakers of varied languages. We hypothesise that the current practices of fine-tuning and evaluating these models may not perfectly align with this…

Computation and Language · Computer Science 2024-09-27 Pinzhen Chen, Simon Yu, Zhicheng Guo, Barry Haddow

Large language models (LLMs) are typically trained on shuffled corpora, yielding models whose knowledge is frozen at train time and whose temporal grounding remains poorly understood. In this work, we study the impact of pre-training…

Computation and Language · Computer Science 2026-05-26 Hippolyte Pilchen, Romain Fabre, Franck Signe Talla, Patrick Perez, Edouard Grave

Pretraining language models on formal language can improve their acquisition of natural language. Which features of the formal language impart an inductive bias that leads to effective transfer? Drawing on insights from linguistics and…

Computation and Language · Computer Science 2025-05-28 Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen

The emergent cross-lingual transfer seen in multilingual pretrained models has sparked significant interest in studying their behavior. However, because these analyses have focused on fully trained multilingual models, little is known about…

Computation and Language · Computer Science 2022-10-25 Terra Blevins, Hila Gonen, Luke Zettlemoyer

When we transfer a pretrained language model to a new language, there are many axes of variation that change at once. To disentangle the impact of different factors like syntactic similarity and vocabulary similarity, we propose a set of…

Computation and Language · Computer Science 2024-01-25 Zhengxuan Wu, Alex Tamkin, Isabel Papadimitriou