Related papers: Multimodal Sparse Coding for Event Detection

Multimodal sparse representation learning and applications

Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between…

Machine Learning · Computer Science · 2016-03-03 · Miriam Cha, Youngjune Gwon, H. T. Kung

A Shared Encoder Approach to Multimodal Representation Learning

Multimodal representation learning has demonstrated remarkable potential in enabling models to process and integrate diverse data modalities, such as text and images, for improved understanding and performance. While the medical domain can…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-04 · Shuvendu Roy, Franklin Ogidi, Ali Etemad, Elham Dolatabadi, Arash Afkanpour

Cross-Modal Discrete Representation Learning

Recent advances in representation learning have demonstrated an ability to represent information from different modalities such as video, text, and audio in a single high-level embedding vector. In this work we present a self-supervised…

Computer Vision and Pattern Recognition · Computer Science · 2021-06-11 · Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James Glass

Learning Shared Cross-modality Representation Using Multispectral-LiDAR and Hyperspectral Data

Due to the ever-growing diversity of the data source, multi-modality feature learning has attracted more and more attention. However, most of these methods are designed by jointly learning feature representation from multi-modalities that…

Computer Vision and Pattern Recognition · Computer Science · 2020-06-09 · Danfeng Hong, Jocelyn Chanussot, Naoto Yokoya, Jian Kang, Xiao Xiang Zhu

MultiMedVision: Multi-Modal Medical Vision Framework

Multi-modal medical imaging enables comprehensive diagnostics, yet current foundation models process 2D (e.g. X-ray) and 3D (e.g. CT) data with separate, dimensionality-specific architectures. We present MultiMedVision, a unified framework…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-12 · Frank Li, Bardia Khosravi, Mohammadreza Chavoshi, Young Seok Jeon, Theo Dapamede, Hari Trivedi, Janice Newsome, Judy Gichoya

Multiple Kernel Sparse Representations for Supervised and Unsupervised Learning

In complex visual recognition tasks it is typical to adopt multiple descriptors, that describe different aspects of the images, for obtaining an improved recognition performance. Descriptors that have diverse forms can be fused into a…

Computer Vision and Pattern Recognition · Computer Science · 2015-06-15 · Jayaraman J. Thiagarajan, Karthikeyan Natesan Ramamurthy, Andreas Spanias

Discriminative Sparse Coding on Multi-Manifold for Data Representation and Classification

Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics, etc. However, the conventional sparse coding algorithms and its manifold…

Computer Vision and Pattern Recognition · Computer Science · 2013-04-04 · Jing-Yan Wang

Multimodal Contrastive Training for Visual Representation Learning

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-28 · Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

Semi-Supervised Multi-Modal Medical Image Segmentation for Complex Situations

Semi-supervised learning addresses the issue of limited annotations in medical images effectively, but its performance is often inadequate for complex backgrounds and challenging tasks. Multi-modal fusion methods can significantly improve…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-23 · Dongdong Meng, Sheng Li, Hao Wu, Guoping Wang, Xueqing Yan

Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration

Multimodal learning leverages complementary information derived from different modalities, thereby enhancing performance in medical image segmentation. However, prevailing multimodal learning methods heavily rely on extensive well-annotated…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-05 · Xiaogen Zhou, Yiyou Sun, Min Deng, Winnie Chiu Wing Chu, Qi Dou

UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning

Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-17 · Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

Multimodal Multipart Learning for Action Recognition in Depth Videos

The articulated and complex nature of human actions makes the task of action recognition difficult. One approach to handle this complexity is dividing it to the kinetics of body parts and analyzing the actions based on these partial…

Computer Vision and Pattern Recognition · Computer Science · 2015-08-03 · Amir Shahroudy, Gang Wang, Tian-Tsong Ng, Qingxiong Yang

Understanding Dark Scenes by Contrasting Multi-Modal Observations

Understanding dark scenes based on multi-modal image data is challenging, as both the visible and auxiliary modalities provide limited semantic information for the task. Previous methods focus on fusing the two modalities but neglect the…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-22 · Xiaoyu Dong, Naoto Yokoya

Model-based Sparse Coding beyond Gaussian Independent Model

Sparse coding aims to model data vectors as sparse linear combinations of basis elements, but a majority of related studies are restricted to continuous data without spatial or temporal structure. A new model-based sparse coding (MSC)…

Methodology · Statistics · 2021-08-24 · Xin Xing, Rui Xie, Wenxuan Zhong

Contrastive Learning with Cross-Modal Knowledge Mining for Multimodal Human Activity Recognition

Human Activity Recognition is a field of research where input data can take many forms. Each of the possible input modalities describes human behaviour in a different way, and each has its own strengths and weaknesses. We explore the…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-07 · Razvan Brinzea, Bulat Khaertdinov, Stylianos Asteriadis

Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss

Emotion recognition is involved in several real-world applications. With an increase in available modalities, automatic understanding of emotions is being performed more accurately. The success in Multimodal Emotion Recognition (MER),…

Computer Vision and Pattern Recognition · Computer Science · 2022-07-26 · Riccardo Franceschini, Enrico Fini, Cigdem Beyan, Alessandro Conti, Federica Arrigoni, Elisa Ricci

Unsupervised Feature Learning by Deep Sparse Coding

In this paper, we propose a new unsupervised feature learning framework, namely Deep Sparse Coding (DeepSC), that extends sparse coding to a multi-layer architecture for visual object recognition tasks. The main innovation of the framework…

Machine Learning · Computer Science · 2013-12-23 · Yunlong He, Koray Kavukcuoglu, Yun Wang, Arthur Szlam, Yanjun Qi

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events. To this end, we propose a novel multimodal framework that instantiates multiple instance…

Computer Vision and Pattern Recognition · Computer Science · 2018-07-10 · Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard

A Study on Unsupervised Dictionary Learning and Feature Encoding for Action Classification

Many efforts have been devoted to develop alternative methods to traditional vector quantization in image domain such as sparse coding and soft-assignment. These approaches can be split into a dictionary learning phase and a feature…

Computer Vision and Pattern Recognition · Computer Science · 2013-09-03 · Xiaojiang Peng, Qiang Peng, Yu Qiao, Junzhou Chen, Mehtab Afzal

Semi-Supervised Sparse Coding

Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a…

Machine Learning · Statistics · 2015-01-19 · Jim Jing-Yan Wang, Xin Gao