Related papers: Multimodal Sparse Coding for Event Detection

200 papers

Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between…

Machine Learning · Computer Science 2016-03-03 Miriam Cha, Youngjune Gwon, H. T. Kung

Multimodal representation learning has demonstrated remarkable potential in enabling models to process and integrate diverse data modalities, such as text and images, for improved understanding and performance. While the medical domain can…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Shuvendu Roy, Franklin Ogidi, Ali Etemad, Elham Dolatabadi, Arash Afkanpour

Recent advances in representation learning have demonstrated an ability to represent information from different modalities such as video, text, and audio in a single high-level embedding vector. In this work we present a self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2021-06-11 Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James Glass

Due to the ever-growing diversity of the data source, multi-modality feature learning has attracted more and more attention. However, most of these methods are designed by jointly learning feature representation from multi-modalities that…

Computer Vision and Pattern Recognition · Computer Science 2020-06-09 Danfeng Hong, Jocelyn Chanussot, Naoto Yokoya, Jian Kang, Xiao Xiang Zhu

Multi-modal medical imaging enables comprehensive diagnostics, yet current foundation models process 2D (e.g. X-ray) and 3D (e.g. CT) data with separate, dimensionality-specific architectures. We present MultiMedVision, a unified framework…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Frank Li, Bardia Khosravi, Mohammadreza Chavoshi, Young Seok Jeon, Theo Dapamede, Hari Trivedi, Janice Newsome, Judy Gichoya

In complex visual recognition tasks it is typical to adopt multiple descriptors, that describe different aspects of the images, for obtaining an improved recognition performance. Descriptors that have diverse forms can be fused into a…

Computer Vision and Pattern Recognition · Computer Science 2015-06-15 Jayaraman J. Thiagarajan, Karthikeyan Natesan Ramamurthy, Andreas Spanias

Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics, etc. However, the conventional sparse coding algorithms and its manifold…

Computer Vision and Pattern Recognition · Computer Science 2013-04-04 Jing-Yan Wang

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy…

Computer Vision and Pattern Recognition · Computer Science 2021-04-28 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

Semi-supervised learning addresses the issue of limited annotations in medical images effectively, but its performance is often inadequate for complex backgrounds and challenging tasks. Multi-modal fusion methods can significantly improve…

Computer Vision and Pattern Recognition · Computer Science 2025-06-23 Dongdong Meng, Sheng Li, Hao Wu, Guoping Wang, Xueqing Yan

Multimodal learning leverages complementary information derived from different modalities, thereby enhancing performance in medical image segmentation. However, prevailing multimodal learning methods heavily rely on extensive well-annotated…

Computer Vision and Pattern Recognition · Computer Science 2024-09-05 Xiaogen Zhou, Yiyou Sun, Min Deng, Winnie Chiu Wing Chu, Qi Dou

Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality…

Computer Vision and Pattern Recognition · Computer Science 2023-05-17 Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

The articulated and complex nature of human actions makes the task of action recognition difficult. One approach to handle this complexity is dividing it to the kinetics of body parts and analyzing the actions based on these partial…

Computer Vision and Pattern Recognition · Computer Science 2015-08-03 Amir Shahroudy, Gang Wang, Tian-Tsong Ng, Qingxiong Yang

Understanding dark scenes based on multi-modal image data is challenging, as both the visible and auxiliary modalities provide limited semantic information for the task. Previous methods focus on fusing the two modalities but neglect the…

Computer Vision and Pattern Recognition · Computer Science 2023-11-22 Xiaoyu Dong, Naoto Yokoya

Sparse coding aims to model data vectors as sparse linear combinations of basis elements, but a majority of related studies are restricted to continuous data without spatial or temporal structure. A new model-based sparse coding (MSC)…

Methodology · Statistics 2021-08-24 Xin Xing, Rui Xie, Wenxuan Zhong

Human Activity Recognition is a field of research where input data can take many forms. Each of the possible input modalities describes human behaviour in a different way, and each has its own strengths and weaknesses. We explore the…

Computer Vision and Pattern Recognition · Computer Science 2022-10-07 Razvan Brinzea, Bulat Khaertdinov, Stylianos Asteriadis

Emotion recognition is involved in several real-world applications. With an increase in available modalities, automatic understanding of emotions is being performed more accurately. The success in Multimodal Emotion Recognition (MER),…

Computer Vision and Pattern Recognition · Computer Science 2022-07-26 Riccardo Franceschini, Enrico Fini, Cigdem Beyan, Alessandro Conti, Federica Arrigoni, Elisa Ricci

In this paper, we propose a new unsupervised feature learning framework, namely Deep Sparse Coding (DeepSC), that extends sparse coding to a multi-layer architecture for visual object recognition tasks. The main innovation of the framework…

Machine Learning · Computer Science 2013-12-23 Yunlong He, Koray Kavukcuoglu, Yun Wang, Arthur Szlam, Yanjun Qi

Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events. To this end, we propose a novel multimodal framework that instantiates multiple instance…

Computer Vision and Pattern Recognition · Computer Science 2018-07-10 Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard

Many efforts have been devoted to develop alternative methods to traditional vector quantization in image domain such as sparse coding and soft-assignment. These approaches can be split into a dictionary learning phase and a feature…

Computer Vision and Pattern Recognition · Computer Science 2013-09-03 Xiaojiang Peng, Qiang Peng, Yu Qiao, Junzhou Chen, Mehtab Afzal

Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a…

Machine Learning · Statistics 2015-01-19 Jim Jing-Yan Wang, Xin Gao