Phillip Rust — Scifaro

EgoBabyVLM: Benchmarking Cross-Modal Learning from Naturalistic Egocentric Video Data

Children acquire language grounding with remarkable robustness from limited visuo-linguistic input in ways that surpass today's best large multimodal models. Recent research suggests current vision-language models (VLMs) trained on curated…

Machine Learning · Computer Science · 2026-05-20 · Dongyan Lin, Phillip Rust, Angel Villar Corrales, Alvin W. M. Tan, Mahi Luthra, Charles-Éric Saint-James, Rashel Moritz, Sheila Krogh-Jespersen, Vanessa Stark, Surya Parimi, Jiayi Shen, Youssef Benchekroun, Yosuke Higuchi, Martin Gleize, Tom Fizycki, Nicolas Hamilakis, Manel Khentout, Sho Tsuji, Balázs Kégl, Juan Pino, Michael C. Frank, Emmanuel Dupoux

SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

Human infants, with only a few hundred hours of speech exposure, acquire basic units of new languages, highlighting a striking efficiency gap compared to the data-hungry self-supervised speech models. To address this gap, this paper…

Computation and Language · Computer Science · 2026-04-21 · Mahi Luthra, Jiayi Shen, Maxime Poli, Angelo Ortiz, Yosuke Higuchi, Youssef Benchekroun, Martin Gleize, Charles-Eric Saint-James, Dongyan Lin, Phillip Rust, Angel Villar, Surya Parimi, Vanessa Stark, Rashel Moritz, Juan Pino, Yann LeCun, Emmanuel Dupoux

Multilingual Pretraining for Pixel Language Models

Pixel language models operate directly on images of rendered text, eliminating the need for a fixed vocabulary. While these models have demonstrated strong capabilities for downstream cross-lingual transfer, multilingual pretraining remains…

Computation and Language · Computer Science · 2025-12-03 · Ilker Kesen, Jonas F. Lotz, Ingo Ziegler, Phillip Rust, Desmond Elliott

Trick or Neat: Adversarial Ambiguity and Language Model Evaluation

Detecting ambiguity is important for language understanding, including uncertainty estimation, humour detection, and processing garden path sentences. We assess language models' sensitivity to ambiguity by introducing an adversarial…

Computation and Language · Computer Science · 2025-06-03 · Antonia Karamolegkou, Oliver Eberle, Phillip Rust, Carina Kauf, Anders Søgaard

Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users

This paper explores the effectiveness of Multimodal Large Language Models (MLLMs) as assistive technologies for visually impaired individuals. We conduct a user survey to identify adoption patterns and key challenges users face with such…

Human-Computer Interaction · Computer Science · 2025-03-31 · Antonia Karamolegkou, Malvina Nikandrou, Georgios Pantazopoulos, Danae Sanchez Villegas, Phillip Rust, Ruchira Dhar, Daniel Hershcovich, Anders Søgaard

Towards Privacy-Aware Sign Language Translation at Scale

A major impediment to the advancement of sign language translation (SLT) is data scarcity. Much of the sign language data currently available on the web cannot be used for training supervised models due to the lack of aligned captions.…

Computation and Language · Computer Science · 2024-08-09 · Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard

Vision-Language Models under Cultural and Inclusive Considerations

Large vision-language models (VLMs) can assist visually impaired people by describing images from their daily lives. Current evaluation datasets may not reflect diverse cultural user backgrounds or the situational context of this use case.…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-09 · Antonia Karamolegkou, Phillip Rust, Yong Cao, Ruixiang Cui, Anders Søgaard, Daniel Hershcovich

PHD: Pixel-Based Language Modeling of Historical Documents

The digitisation of historical documents has provided historians with unprecedented research opportunities. Yet, the conventional approach to analysing historical documents involves converting them from images to text using OCR, a process…

Computation and Language · Computer Science · 2023-11-07 · Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein

Text Rendering Strategies for Pixel Language Models

Pixel-based language models process text rendered as images, which allows them to handle any script, making them a promising approach to open vocabulary language modelling. However, recent approaches use text renderers that produce a large…

Computation and Language · Computer Science · 2023-11-02 · Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages. However, these models should ideally also be private,…

Computation and Language · Computer Science · 2023-08-21 · Phillip Rust, Anders Søgaard

Language Modelling with Pixels

Language models are defined over a finite set of inputs, which creates a vocabulary bottleneck when we attempt to scale the number of supported languages. Tackling this bottleneck results in a trade-off between what can be represented in…

Computation and Language · Computer Science · 2023-04-27 · Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, Desmond Elliott

Challenges and Strategies in Cross-Cultural NLP

Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages. However, it is important to acknowledge that speakers and the content they…

Computation and Language · Computer Science · 2022-03-21 · Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance. We study a set of nine…

Computation and Language · Computer Science · 2021-06-03 · Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych

PuzzLing Machines: A Challenge on Learning From Small Data

Deep neural models have repeatedly proved excellent at memorizing surface patterns from large datasets for various ML and NLP benchmarks. They struggle to achieve human-like thinking, however, because they lack the skill of iterative…

Computation and Language · Computer Science · 2020-04-29 · Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych