Patrick Cardinal

Where Do Backdoors Live? A Component-Level Analysis of Backdoor Propagation in Speech Language Models

Speech language models (SLMs) are systems of systems: independent components that unite to achieve a common goal. Despite their heterogeneous nature, SLMs are often studied end-to-end; how information flows through the pipeline remains…

Computation and Language · Computer Science 2026-04-08 Alexandrine Fortier, Thomas Thebaud, Jesús Villalba, Najim Dehak, Patrick Cardinal, Peter West

Weakly Supervised Learning for Facial Affective Behavior Analysis: A Review

Recent advances in deep learning (DL) and computational capacity have enabled facial affective behavior analysis (FABA) to progress from static images captured in controlled settings to fine-grained analysis of facial expressions in…

Computer Vision and Pattern Recognition · Computer Science 2026-03-30 R. Gnana Praveen, Patrick Cardinal, Eric Granger

Multi-Target Backdoor Attacks Against Speaker Recognition

In this work, we propose a multi-target backdoor attack against speaker identification using position-independent clicking sounds as triggers. Unlike previous single-target approaches, our method targets up to 50 speakers simultaneously,…

Sound · Computer Science 2025-10-10 Alexandrine Fortier, Sonal Joshi, Thomas Thebaud, Jesús Villalba, Najim Dehak, Patrick Cardinal

Automatic Proficiency Assessment in L2 English Learners

Second language (L2) proficiency in English is usually evaluated perceptually by English teachers or expert evaluators, with inherent intra- and inter-rater variability. This paper explores deep learning techniques for comprehensive L2…

Computation and Language · Computer Science 2025-05-06 Armita Mohammadi, Alessandro Lameiras Koerich, Laureano Moro-Velazquez, Patrick Cardinal

A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

Multimodal emotion recognition has recently gained much attention since it can leverage diverse and complementary relationships over multiple modalities (e.g., audio, visual, biosignals, etc.), and can provide some robustness to noisy…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 R. Gnana Praveen, Wheidima Carneiro de Melo, Nasib Ullah, Haseeb Aslam, Osama Zeeshan, Théo Denorme, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Patrick Cardinal, Eric Granger

Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition

Multimodal analysis has recently drawn much interest in affective computing, since it can improve the overall accuracy of emotion recognition over isolated uni-modal approaches. The most effective techniques for multimodal emotion…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 R. Gnana Praveen, Eric Granger, Patrick Cardinal

Deep Domain Adaptation for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labelled Videos

Estimation of pain intensity from facial expressions captured in videos has an immense potential for health care applications. Given the challenges related to subjective variations of facial expressions, and operational capture conditions,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 R. Gnana Praveen, Eric Granger, Patrick Cardinal

Deep Weakly-Supervised Domain Adaptation for Pain Localization in Videos

Automatic pain assessment has an important potential diagnostic value for populations that are incapable of articulating their pain experiences. As one of the dominating nonverbal channels for eliciting pain expression events, facial…

Computer Vision and Pattern Recognition · Computer Science 2024-07-09 R. Gnana Praveen, Eric Granger, Patrick Cardinal

Deep DA for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labeled Videos

Automatic estimation of pain intensity from facial expressions in videos has an immense potential in health care applications. However, domain adaptation (DA) is needed to alleviate the problem of domain shifts that typically occurs between…

Computer Vision and Pattern Recognition · Computer Science 2023-09-13 Gnana Praveen R, Eric Granger, Patrick Cardinal

Recursive Joint Attention for Audio-Visual Fusion in Regression based Emotion Recognition

In video-based emotion recognition (ER), it is important to effectively leverage the complementary relationship between audio (A) and visual (V) modalities, while retaining the intra-modal characteristics of individual modalities. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 R Gnana Praveen, Eric Granger, Patrick Cardinal

RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks

This paper introduces a new synthesis-based defense algorithm for counteracting a variety of adversarial attacks developed to challenge the performance of cutting-edge speech-to-text transcription systems. Our algorithm…

Sound · Computer Science 2022-10-26 Mohammad Esmaeilpour, Nourhene Chaalia, Patrick Cardinal

Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention

Automatic emotion recognition (ER) has recently gained a lot of interest due to its potential in many real-world applications. In this context, multimodal approaches have been shown to improve performance (over unimodal approaches) by…

Computer Vision and Pattern Recognition · Computer Science 2022-09-20 R Gnana Praveen, Eric Granger, Patrick Cardinal

RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular Data Synthesis

This paper introduces a novel generative adversarial network (GAN) for synthesizing large-scale tabular databases which contain various features such as continuous, discrete, and binary. Technically, our GAN belongs to the category of…

Machine Learning · Computer Science 2022-05-25 Mohammad Esmaeilpour, Nourhene Chaalia, Adel Abusitta, Francois-Xavier Devailly, Wissem Maazoun, Patrick Cardinal

Named Entity Recognition for Audio De-Identification

Data anonymization is often a task carried out by humans. Automating it would reduce the cost and time required to complete this task. This paper presents a pipeline to automate the anonymization of audio data in French. We propose a…

Sound · Computer Science 2022-04-28 Guillaume Baril, Patrick Cardinal, Alessandro Lameiras Koerich

From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks

This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network, namely…

Sound · Computer Science 2022-04-15 Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

Bi-Discriminator Class-Conditional Tabular GAN

This paper introduces a bi-discriminator GAN for synthesizing tabular datasets containing continuous, binary, and discrete columns. Our proposed approach employs an adapted preprocessing scheme and a novel conditional term for the generator…

Machine Learning · Computer Science 2021-12-06 Mohammad Esmaeilpour, Nourhene Chaalia, Adel Abusitta, Francois-Xavier Devailly, Wissem Maazoun, Patrick Cardinal

Cyclic Defense GAN Against Speech Adversarial Attacks

This paper proposes a new defense approach for counteracting state-of-the-art white and black-box adversarial attack algorithms. Our approach fits into the implicit reactive defense algorithm category since it does not directly manipulate…

Sound · Computer Science 2021-08-24 Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

Towards Robust Speech-to-Text Adversarial Attack

This paper introduces a novel adversarial algorithm for attacking the state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo. Our approach is based on developing an extension for the conventional distortion condition…

Sound · Computer Science 2021-03-16 Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems

This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems. The proposed defense algorithm has four major steps. First, we represent speech signals with 2D spectrograms…

Sound · Computer Science 2021-03-16 Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

Class-Conditional Defense GAN Against End-to-End Speech Attacks

In this paper we propose a novel defense approach against end-to-end adversarial attacks developed to fool advanced speech-to-text systems such as DeepSpeech and Lingvo. Unlike conventional defense approaches, the proposed approach does not…

Sound · Computer Science 2021-02-23 Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich