English
Related papers

Related papers: Noise Estimation Using Density Estimation for Self…

200 papers

Multimodal deep learning systems which employ multiple modalities like text, image, audio, video, etc., are showing better performance in comparison with individual modalities (i.e., unimodal) systems. Multimodal machine learning involves…

Machine Learning · Computer Science 2022-01-19 Anil Rahate , Rahee Walambe , Sheela Ramanna , Ketan Kotecha

A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation…

Computer Vision and Pattern Recognition · Computer Science 2022-11-08 Abhinav Joshi , Naman Gupta , Jinang Shah , Binod Bhattarai , Ashutosh Modi , Danail Stoyanov

Multimodal learning systems often face substantial uncertainty due to noisy data, low-quality labels, and heterogeneous modality characteristics. These issues become especially critical in human-computer interaction settings, where data…

Artificial Intelligence · Computer Science 2025-11-21 Hyo-Jeong Jang

Representation Learning is a significant and challenging task in multimodal learning. Effective modality representations should contain two parts of characteristics: the consistency and the difference. Due to the unified multimodal…

Computation and Language · Computer Science 2021-02-10 Wenmeng Yu , Hua Xu , Ziqi Yuan , Jiele Wu

Multimodal learning, which aims to understand and analyze information from multiple modalities, has achieved substantial progress in the supervised regime in recent years. However, the heavy dependence on data paired with expensive human…

Machine Learning · Computer Science 2024-08-19 Yongshuo Zong , Oisin Mac Aodha , Timothy Hospedales

Recently, video recognition is emerging with the help of multi-modal learning, which focuses on integrating distinct modalities to improve the performance or robustness of the model. Although various multi-modal learning methods have been…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Haochen Han , Qinghua Zheng , Minnan Luo , Kaiyao Miao , Feng Tian , Yan Chen

Improving model robustness against potential modality noise, as an essential step for adapting multimodal models to real-world applications, has received increasing attention among researchers. For Multimodal Sentiment Analysis (MSA), there…

Multimedia · Computer Science 2022-11-28 Huisheng Mao , Baozheng Zhang , Hua Xu , Ziqi Yuan , Yihe Liu

Multimodal Language Analysis is a demanding area of research, since it is associated with two requirements: combining different modalities and capturing temporal information. During the last years, several works have been proposed in the…

Computation and Language · Computer Science 2022-01-10 Panagiotis Koromilas , Theodoros Giannakopoulos

Current meta-learning approaches focus on learning functional representations of relationships between variables, i.e. on estimating conditional expectations in regression. In many applications, however, we are faced with conditional…

Machine Learning · Statistics 2021-02-25 Jean-Francois Ton , Lucian Chan , Yee Whye Teh , Dino Sejdinovic

Multimodal emotion recognition (MER) in practical scenarios is significantly challenged by the presence of missing or incomplete data across different modalities. To overcome these challenges, researchers have aimed to simulate incomplete…

Computer Vision and Pattern Recognition · Computer Science 2024-09-20 Qi Fan , Haolin Zuo , Rui Liu , Zheng Lian , Guanglai Gao

Representation learning is fundamental to modern machine learning, powering applications such as text retrieval and multimodal understanding. However, learning robust and generalizable representations remains challenging. While prior work…

Machine Learning · Computer Science 2025-11-27 Jiaoyang Li , Jun Fang , Tianhao Gao , Xiaohui Zhang , Zhiyuan Liu , Chao Liu , Pengzhang Liu , Qixia Jiang

Medical image denoising is considered among the most challenging vision tasks. Despite the real-world implications, existing denoising methods have notable drawbacks as they often generate visual artifacts when applied to heterogeneous…

Image and Video Processing · Electrical Eng. & Systems 2025-03-11 S M A Sharif , Rizwan Ali Naqvi , Woong-Kee Loh

In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: how modalities combine to provide new task-relevant information that was not…

A major challenge in multimodal learning is the presence of noise within individual modalities. This noise inherently affects the resulting multimodal representations, especially when these representations are obtained through explicit…

Computer Vision and Pattern Recognition · Computer Science 2025-08-25 Mohammad Zia Ur Rehman , Devraj Raghuvanshi , Umang Jain , Shubhi Bansal , Nagendra Kumar

Non-intrusive speech quality assessment is a crucial operation in multimedia applications. The scarcity of annotated data and the lack of a reference signal represent some of the main challenges for designing efficient quality assessment…

Audio and Speech Processing · Electrical Eng. & Systems 2021-08-20 Alessandro Ragano , Emmanouil Benetos , Andrew Hines

The conventional success of textual classification relies on annotated data, and the new paradigm of pre-trained language models (PLMs) still requires a few labeled data for downstream tasks. However, in real-world applications, label noise…

Computation and Language · Computer Science 2022-10-14 Dan Qiao , Chenchen Dai , Yuyang Ding , Juntao Li , Qiang Chen , Wenliang Chen , Min Zhang

Multi-modal learning, particularly among imaging and linguistic modalities, has made amazing strides in many high-level fundamental visual understanding problems, ranging from language grounding to dense event captioning. However, much of…

Computer Vision and Pattern Recognition · Computer Science 2019-10-28 Tanzila Rahman , Bicheng Xu , Leonid Sigal

Learning multimodal representations involves integrating information from multiple heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world applications in multimedia, affective computing, robotics,…

In this paper we explore audiovisual emotion recognition under noisy acoustic conditions with a focus on speech features. We attempt to answer the following research questions: (i) How does speech emotion recognition perform on noisy data?…

Sound · Computer Science 2021-03-03 Michael Neumann , Ngoc Thang Vu

Falsely annotated samples, also known as noisy labels, can significantly harm the performance of deep learning models. Two main approaches for learning with noisy labels are global noise estimation and data filtering. Global noise…

Machine Learning · Computer Science 2025-07-31 Yuval Grinberg , Nimrod Harel , Jacob Goldberger , Ofir Lindenbaum
‹ Prev 1 2 3 10 Next ›