Related papers: DeMuX: Data-efficient Multilingual Learning

LAUD: Integrating Large Language Models with Active Learning for Unlabeled Data

Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, lacking labeled data often…

Machine Learning · Computer Science 2025-11-19 Tzu-Hsuan Chou, Chun-Nan Chou

Debiased Learning from Naturally Imbalanced Pseudo-Labels

Pseudo-labels are confident predictions made on unlabeled target data by a classifier trained on labeled source data. They are widely used for adapting a model to unlabeled data, e.g., in a semi-supervised learning setting. Our key insight…

Machine Learning · Computer Science 2022-04-22 Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu

Constrained Decoding for Cross-lingual Label Projection

Zero-shot cross-lingual transfer utilizing multilingual LLMs has become a popular learning paradigm for low-resource languages with no labeled training data. However, for NLP tasks that involve fine-grained predictions on words and phrases,…

Computation and Language · Computer Science 2024-02-06 Duong Minh Le, Yang Chen, Alan Ritter, Wei Xu

Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts

Robust benchmarks are crucial for evaluating Multimodal Large Language Models (MLLMs). Yet we find that models can ace many multimodal benchmarks without strong visual understanding, instead exploiting biases, linguistic priors, and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-07 Ellis Brown, Jihan Yang, Shusheng Yang, Rob Fergus, Saining Xie

DeFTX: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer

Effective cross-lingual transfer remains a critical challenge in scaling the benefits of large language models from high-resource to low-resource languages. Towards this goal, prior studies have explored many approaches to combine task…

Computation and Language · Computer Science 2025-05-22 Sona Elza Simon, Preethi Jyothi

DELULU: Discriminative Embedding Learning Using Latent Units for Speaker-Aware Self-Trained Speech Foundational Model

Self-supervised speech models have achieved remarkable success on content-driven tasks, yet they remain limited in capturing speaker-discriminative features critical for verification, diarization, and profiling applications. We introduce…

Sound · Computer Science 2026-03-26 Massa Baali, Rita Singh, Bhiksha Raj

Predictions For Pre-training Language Models

Language model pre-training has proven to be useful in many language understanding tasks. In this paper, we investigate whether it is still helpful to add the self-training method in the pre-training step and the fine-tuning step. Towards…

Computation and Language · Computer Science 2023-02-17 Tong Guo

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

In this paper, we introduce XGLUE, a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora and evaluate their performance across a diverse set of cross-lingual…

Computation and Language · Computer Science 2020-05-25 Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages

Reliable evaluation benchmarks designed for replicability and comprehensiveness have driven progress in machine learning. Due to the lack of a multilingual benchmark, however, vision-and-language research has mostly focused on English…

Computation and Language · Computer Science 2022-07-19 Emanuele Bugliarello, Fangyu Liu, Jonas Pfeiffer, Siva Reddy, Desmond Elliott, Edoardo Maria Ponti, Ivan Vulić

Unsupervised Multilingual Dense Retrieval via Generative Pseudo Labeling

Dense retrieval methods have demonstrated promising performance in multilingual information retrieval, where queries and documents can be in different languages. However, dense retrievers typically require a substantial amount of paired…

Computation and Language · Computer Science 2024-03-07 Chao-Wei Huang, Chen-An Li, Tsu-Yuan Hsu, Chen-Yu Hsu, Yun-Nung Chen

Multi-task Learning for Multilingual Neural Machine Translation

While monolingual data has been shown to be useful in improving bilingual neural machine translation (NMT), effectively and efficiently leveraging monolingual data for Multilingual NMT (MNMT) systems is a less explored area. In this work,…

Computation and Language · Computer Science 2020-10-07 Yiren Wang, ChengXiang Zhai, Hany Hassan Awadalla

Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such…

Computation and Language · Computer Science 2022-10-26 Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder

Generalized Multi-Task Learning from Substantially Unlabeled Multi-Source Medical Image Data

Deep learning-based models, when trained in a fully-supervised manner, can be effective in performing complex image analysis tasks, although contingent upon the availability of large labeled datasets. Especially in the medical imaging…

Computer Vision and Pattern Recognition · Computer Science 2023-06-29 Ayaan Haque, Abdullah-Al-Zubaer Imran, Adam Wang, Demetri Terzopoulos

Boosting Zero-Shot Crosslingual Performance using LLM-Based Augmentations with Effective Data Selection

Large language models (LLMs) are very proficient text generators. We leverage this capability of LLMs to generate task-specific data via zero-shot prompting and promote cross-lingual transfer for low-resource target languages. Given…

Computation and Language · Computer Science 2024-07-16 Barah Fazili, Ashish Sunil Agrawal, Preethi Jyothi

Spoken Language Understanding on Unseen Tasks With In-Context Learning

Spoken language understanding (SLU) tasks involve diverse skills that probe the information extraction, classification and/or generation capabilities of models. In this setting, task-specific training data may not always be available. While…

Computation and Language · Computer Science 2025-10-06 Neeraj Agrawal, Sriram Ganapathy

Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification

We consider zero-shot cross-lingual transfer in legal topic classification using the recent MultiEURLEX dataset. Since the original dataset contains parallel documents, which is unrealistic for zero-shot cross-lingual transfer, we develop a…

Computation and Language · Computer Science 2022-06-09 Stratos Xenouleas, Alexia Tsoukara, Giannis Panagiotakis, Ilias Chalkidis, Ion Androutsopoulos

Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training

Determining an effective data mixture is a key factor in Large Language Model (LLM) pre-training, where models must balance general competence with proficiency on hard tasks such as math and code. However, identifying an optimal mixture…

Computation and Language · Computer Science 2026-05-18 Shengrui Li, Fei Zhao, Kaiyan Zhao, Jieying Ye, Haifeng Liu, Fangcheng Shi, Zheyong Xie, Yao Hu, Shaosheng Cao

The Benefits of Label-Description Training for Zero-Shot Text Classification

Pretrained language models have improved zero-shot text classification by allowing the transfer of semantic knowledge from the training data in order to classify among specific label sets in downstream tasks. We propose a simple way to…

Computation and Language · Computer Science 2023-10-24 Lingyu Gao, Debanjan Ghosh, Kevin Gimpel

ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters

We tackle the problem of zero-shot cross-lingual transfer in NLP tasks via the use of language adapters (LAs). Most of the earlier works have explored training with adapter of a single source (often English), and testing either using the…

Computation and Language · Computer Science 2023-10-26 Vipul Rathore, Rajdeep Dhingra, Parag Singla, Mausam

LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) have achieved strong performance on general visual benchmarks but struggle with out-of-distribution (OOD) tasks in specialized domains such as medical imaging, where labeled data is limited and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-06 Ci-Siang Lin, Min-Hung Chen, Yu-Yang Sheng, Yu-Chiang Frank Wang