Mathieu Constant

Conversion of Lexicon-Grammar tables to LMF. Application to French

We describe the first experiment of conversion of Lexicon-Grammar tables for French verbs into the Lexical Markup Framework (LMF) format. The Lexicon-Grammar of the French language is currently one of the major sources of lexical and…

Computation and Language · Computer Science 2026-05-15 Eric Laporte, Elsa Tolone, Mathieu Constant

TrackList: Tracing Back Query Linguistic Diversity for Head and Tail Knowledge in Open Large Language Models

Large Language Models (LLMs) have proven efficient in giving definition-type answers to user input queries. While for humans giving various types of answers, such as examples and paraphrases, is an easy task, LLMs struggle to provide…

Computation and Language · Computer Science 2025-12-01 Ioana Buhnila, Aman Sinha, Mathieu Constant

ImmunoFOMO: Are Language Models missing what oncologists see?

Language model (LM) capabilities have grown at a fast pace over the past decade, leading researchers in various disciplines, such as biomedical research, to increasingly explore the utility of LMs in their day-to-day applications. Domain…

Computation and Language · Computer Science 2025-06-16 Aman Sinha, Bogdan-Valentin Popescu, Xavier Coubez, Marianne Clausel, Mathieu Constant

No Imputation Needed: A Switch Approach to Irregularly Sampled Time Series

Modeling irregularly-sampled time series (ISTS) is challenging because of missing values. Most existing methods focus on handling ISTS by converting irregularly sampled data into regularly sampled data via imputation. These models assume an…

Artificial Intelligence · Computer Science 2024-08-21 Rohit Agarwal, Aman Sinha, Ayan Vishwakarma, Xavier Coubez, Marianne Clausel, Mathieu Constant, Alexander Horsch, Dilip K. Prasad

Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models

The recent surge in the accessibility of large language models (LLMs) to the general population can lead to untrackable use of such models for medical-related recommendations. Language generation via LLMs has two key problems: firstly,…

Computation and Language · Computer Science 2024-07-24 Ioana Buhnila, Aman Sinha, Mathieu Constant

Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification?

The success of pretrained language models (PLMs) across a spate of use cases has led to significant investment from the NLP community towards building domain-specific foundational models. On the other hand, in mission-critical settings such…

Computation and Language · Computer Science 2024-07-18 Aman Sinha, Timothee Mickus, Marianne Clausel, Mathieu Constant, Xavier Coubez

How to Dissect a Muppet: The Structure of Transformer Embedding Spaces

Pretrained embeddings based on the Transformer architecture have taken the NLP community by storm. We show that they can mathematically be reframed as a sum of vector factors and showcase how to use this reframing to study the impact of…

Computation and Language · Computer Science 2022-06-09 Timothee Mickus, Denis Paperno, Mathieu Constant

Semeval-2022 Task 1: CODWOE -- Comparing Dictionaries and Word Embeddings

Word embeddings have advanced the state of the art in NLP across numerous tasks. Understanding the contents of dense neural representations is of utmost interest to the computational semantics community. We propose to focus on relating…

Computation and Language · Computer Science 2022-05-30 Timothee Mickus, Kees van Deemter, Mathieu Constant, Denis Paperno

A Game Interface to Study Semantic Grounding in Text-Based Models

Can language models learn grounded representations from text distribution alone? This question is both central and recurrent in natural language processing; authors generally agree that grounding requires more than textual distribution. We…

Computation and Language · Computer Science 2021-08-18 Timothee Mickus, Mathieu Constant, Denis Paperno

What do you mean, BERT? Assessing BERT as a Distributional Semantics Model

Contextualized word embeddings, i.e. vector representations for words in context, are naturally seen as an extension of previous noncontextual distributional semantic models. In this work, we focus on BERT, a deep neural network that…

Computation and Language · Computer Science 2020-05-11 Timothee Mickus, Denis Paperno, Mathieu Constant, Kees van Deemter

Mark my Word: A Sequence-to-Sequence Approach to Definition Modeling

Defining words in a textual context is a useful task both for practical purposes and for gaining insight into distributed word representations. Building on the distributional hypothesis, we argue here that the most natural formalization of…

Computation and Language · Computer Science 2019-11-14 Timothee Mickus, Denis Paperno, Mathieu Constant