Related papers: Multi-Classifier Interactive Learning for Ambiguou…

Advancing Multiple Instance Learning with Attention Modeling for Categorical Speech Emotion Recognition

Categorical speech emotion recognition is typically performed as a sequence-to-label problem, i.e., to determine the discrete emotion label of the input utterance as a whole. One of the main challenges in practice is that most of the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-18 Shuiyang Mao, P. C. Ching, C.-C. Jay Kuo, Tan Lee

Multi-label Class Incremental Emotion Decoding with Augmented Emotional Semantics Learning

Emotion decoding plays an important role in affective human-computer interaction. However, previous studies ignored the dynamic real-world scenario, where humans experience a blend of multiple emotions which are incrementally integrated into…

Artificial Intelligence · Computer Science 2024-06-03 Kaicheng Fu, Changde Du, Xiaoyu Chen, Jie Peng, Huiguang He

Multimodal Speech Emotion Recognition and Ambiguity Resolution

Identifying emotion from speech is a non-trivial task pertaining to the ambiguous definition of emotion itself. In this work, we adopt a feature-engineering based approach to tackle the task of speech emotion recognition. Formalizing our…

Machine Learning · Computer Science 2019-04-15 Gaurav Sahu

Reasoning under Ambiguity: Uncertainty-Aware Multilingual Emotion Classification under Partial Supervision

Contemporary knowledge-based systems increasingly rely on multilingual emotion identification to support intelligent decision-making, yet they face major challenges due to emotional ambiguity and incomplete supervision. Emotion recognition…

Computation and Language · Computer Science 2026-02-12 Md. Mithun Hossain, Mashary N. Alrasheedy, Nirban Bhowmick, Shamim Forhad, Md. Shakil Hossain, Sudipto Chaki, Md Shafiqul Islam

Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition

Recognizing emotions from speech is a daunting task due to the subtlety and ambiguity of expressions. Traditional speech emotion recognition (SER) systems, which typically rely on a singular, precise emotion label, struggle with this…

Sound · Computer Science 2024-08-02 Haoqin Sun, Shiwan Zhao, Xiangyu Kong, Xuechen Wang, Hui Wang, Jiaming Zhou, Yong Qin

Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching

Automatic emotion recognition is an active research topic with a wide range of applications. Due to the high manual annotation cost and inevitable label ambiguity, the development of emotion recognition datasets is limited in both scale and…

Audio and Speech Processing · Electrical Eng. & Systems 2020-09-08 Jingjun Liang, Ruichen Li, Qin Jin

Dynamic Prompt Adjustment for Multi-Label Class-Incremental Learning

Significant advancements have been made in single-label class-incremental learning (SLCIL), yet the more practical and challenging multi-label class-incremental learning (MLCIL) remains understudied. Recently, visual language models such as CLIP…

Computer Vision and Pattern Recognition · Computer Science 2025-01-06 Haifeng Zhao, Yuguang Jin, Leilei Ma

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are…

Computer Vision and Pattern Recognition · Computer Science 2021-11-25 Sai Rajeswar, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville

Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation

The subjective perception of emotion leads to inconsistent labels from human annotators. Typically, utterances lacking majority-agreed labels are excluded when training an emotion classifier, which causes problems when encountering ambiguous…

Computation and Language · Computer Science 2024-10-14 Wen Wu, Bo Li, Chao Zhang, Chung-Cheng Chiu, Qiujia Li, Junwen Bai, Tara N. Sainath, Philip C. Woodland

Curriculum Learning for Speech Emotion Recognition from Crowdsourced Labels

This study introduces a method to design a curriculum for machine learning to maximize efficiency during the training of deep neural networks (DNNs) for speech emotion recognition. Previous studies in other machine-learning…

Audio and Speech Processing · Electrical Eng. & Systems 2022-03-17 Reza Lotfian, Carlos Busso

AmbER$^2$: Dual Ambiguity-Aware Emotion Recognition Applied to Speech and Text

Emotion recognition is inherently ambiguous, with uncertainty arising both from rater disagreement and from discrepancies across modalities such as speech and text. There is growing interest in modeling rater ambiguity using label…

Audio and Speech Processing · Electrical Eng. & Systems 2026-01-27 Jingyao Wu, Grace Lin, Yinuo Song, Rosalind Picard

An Empirical Study and Improvement for Speech Emotion Recognition

Multimodal speech emotion recognition aims to detect speakers' emotions from audio and text. Prior works mainly focus on exploiting advanced networks to model and fuse different modality information to facilitate performance, while…

Computation and Language · Computer Science 2023-04-11 Zhen Wu, Yizhe Lu, Xinyu Dai

Memory-guided Prototypical Co-occurrence Learning for Mixed Emotion Recognition

Emotion recognition from multi-modal physiological and behavioral signals plays a pivotal role in affective computing, yet most existing models remain constrained to the prediction of singular emotions in controlled laboratory settings.…

Machine Learning · Computer Science 2026-02-25 Ming Li, Yong-Jin Liu, Fang Liu, Huankun Sheng, Yeying Fan, Yixiang Wei, Minnan Luo, Weizhan Zhang, Wenping Wang

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification

Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages, e.g., more English texts are labeled than texts in any other language,…

Information Retrieval · Computer Science 2019-03-26 Zhenpeng Chen, Sheng Shen, Ziniu Hu, Xuan Lu, Qiaozhu Mei, Xuanzhe Liu

MMER: Multimodal Multi-task Learning for Speech Emotion Recognition

In this paper, we propose MMER, a novel Multimodal Multi-task learning approach for Speech Emotion Recognition. MMER leverages a novel multimodal network based on early-fusion and cross-modal self-attention between text and acoustic…

Computation and Language · Computer Science 2023-06-06 Sreyan Ghosh, Utkarsh Tyagi, S Ramaneswaran, Harshvardhan Srivastava, Dinesh Manocha

Addressing Ambiguity of Emotion Labels Through Meta-Learning

Emotion labels in emotion recognition corpora are highly noisy and ambiguous, due to the annotators' subjective perception of emotions. Such ambiguity may introduce errors in automatic classification and affect the overall performance. We…

Audio and Speech Processing · Electrical Eng. & Systems 2019-11-11 Takuya Fujioka, Dario Bertero, Takeshi Homma, Kenji Nagamatsu

Disentangling Reasoning in Large Audio-Language Models for Ambiguous Emotion Prediction

Speech emotion recognition plays an important role in various applications. However, most existing approaches predict a single emotion label, oversimplifying the inherently ambiguous nature of human emotional expression. Recent large…

Sound · Computer Science 2026-03-10 Xiaofeng Yu, Jiaheng Dong, Jean Honorio, Abhirup Ghosh, Hong Jia, Ting Dang

Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data

This paper proposes a multimodal emotion recognition system based on hybrid fusion that classifies the emotions depicted by speech utterances and corresponding images into discrete classes. A new interpretability technique has been…

Computer Vision and Pattern Recognition · Computer Science 2023-01-10 Puneet Kumar, Sarthak Malik, Balasubramanian Raman

Unifying the Discrete and Continuous Emotion labels for Speech Emotion Recognition

Traditionally, in paralinguistic analysis for emotion detection from speech, emotions have been identified with discrete or dimensional (continuous-valued) labels. Accordingly, models that have been proposed for emotion detection use one or…

Sound · Computer Science 2022-11-01 Roshan Sharma, Hira Dhamyal, Bhiksha Raj, Rita Singh

Multi-Modal Emotion Recognition by Text, Speech and Video Using Pretrained Transformers

Due to the complex nature of human emotions and the diversity of emotion representation methods in humans, emotion recognition is a challenging field. In this research, three input modalities, namely text, audio (speech), and video, are…

Artificial Intelligence · Computer Science 2024-02-13 Minoo Shayaninasab, Bagher Babaali