Related papers: Quality-based Multimodal Classification Using Tree…

Multimodal Task-Driven Dictionary Learning for Image Classification

Dictionary learning algorithms have been successfully used for both reconstructive and discriminative tasks, where an input signal is represented with a sparse linear combination of dictionary atoms. While these methods are mostly developed…

Machine Learning · Statistics 2016-01-20 Soheil Bahrampour , Nasser M. Nasrabadi , Asok Ray , W. Kenneth Jenkins

Sparse Fusion for Multimodal Transformers

Multimodal classification is a core task in human-centric machine learning. We observe that information is highly complementary across modalities, thus unimodal information can be drastically sparsified prior to multimodal fusion without…

Computer Vision and Pattern Recognition · Computer Science 2021-11-29 Yi Ding , Alex Rich , Mason Wang , Noah Stier , Matthew Turk , Pradeep Sen , Tobias Höllerer

Multi-Focus Image Fusion Using Sparse Representation and Coupled Dictionary Learning

We address the multi-focus image fusion problem, where multiple images captured with different focal settings are to be fused into an all-in-focus image of higher quality. Algorithms for this problem necessarily admit the source image…

Computer Vision and Pattern Recognition · Computer Science 2019-05-06 Farshad G. Veshki , Sergiy A. Vorobyov

Efficient Large-Scale Multi-Modal Classification

While the incipient internet was largely text-based, the modern digital world is becoming increasingly multi-modal. Here, we examine multi-modal classification where one modality is discrete, e.g. text, and the other is continuous, e.g.…

Computation and Language · Computer Science 2018-02-09 D. Kiela , E. Grave , A. Joulin , T. Mikolov

Flexible tree-structured regression for clustered data with an application to quality of life in older adults

Tree-structured models are a powerful alternative to parametric regression models if non-linear effects and interactions are present in the data. Yet, classical tree-structured models might not be appropriate if data comes in clusters of…

Methodology · Statistics 2025-01-23 Nikolai Spuck , Matthias Schmid , Moritz Berger

Multimodal sparse representation learning and applications

Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between…

Machine Learning · Computer Science 2016-03-03 Miriam Cha , Youngjune Gwon , H. T. Kung

Random Similarity Forests

The wealth of data being gathered about humans and their surroundings drives new machine learning applications in various fields. Consequently, more and more often, classifiers are trained using not only numerical data but also complex data…

Machine Learning · Computer Science 2022-04-13 Maciej Piernik , Dariusz Brzezinski , Pawel Zawadzki

Document Classification Pattern Recognition via Information Fusion: A Systematic Review of Multimodal and Multiview Representation Approaches

Information fusion is used widely to improve document classification by the integration of multiple data sources (multimodal) or representations (multiview). However, the field lacks a unified framework, a quantitative synthesis of its…

Computation and Language · Computer Science 2026-05-27 Marcin Michał Mirończuk

EmbraceNet: A robust deep learning architecture for multimodal classification

Classification using multimodal data arises in many machine learning applications. It is crucial not only to model cross-modal relationship effectively but also to ensure robustness against loss of part of data or modalities. In this paper,…

Machine Learning · Computer Science 2019-04-22 Jun-Ho Choi , Jong-Seok Lee

Beyond Images: Adaptive Fusion of Visual and Textual Data for Food Classification

This study introduces a novel multimodal food recognition framework that effectively combines visual and textual modalities to enhance classification accuracy and robustness. The proposed approach employs a dynamic multimodal fusion…

Computer Vision and Pattern Recognition · Computer Science 2025-08-06 Prateek Mittal , Puneet Goyal , Joohi Chauhan

Query Adaptive Late Fusion for Image Retrieval

Feature fusion is a commonly used strategy in image retrieval tasks, which aggregates the matching responses of multiple visual features. Feasible sets of features can be either descriptors (SIFT, HSV) for an entire image or the same…

Information Retrieval · Computer Science 2018-11-01 Zhongdao Wang , Liang Zheng , Shengjin Wang

Quality-Aware Multimodal Biometric Recognition

We present a quality-aware multimodal recognition framework that combines representations from multiple biometric traits with varying quality and number of samples to achieve increased recognition accuracy by extracting complimentary…

Computer Vision and Pattern Recognition · Computer Science 2021-12-14 Sobhan Soleymani , Ali Dabouei , Fariborz Taherkhani , Seyed Mehdi Iranmanesh , Jeremy Dawson , Nasser M. Nasrabadi

Structure-guided Deep Multi-View Clustering

Deep multi-view clustering seeks to utilize the abundant information from multiple views to improve clustering performance. However, most of the existing clustering methods often neglect to fully mine multi-view structural information and…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Jinrong Cui , Xiaohuang Wu , Haitao Zhang , Chongjie Dong , Jie Wen

Sparsity in Optimal Randomized Classification Trees

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity…

Optimization and Control · Mathematics 2020-02-24 Rafael Blanquero , Emilio Carrizosa , Cristina Molero-Río , Dolores Romero Morales

Modeling with Categorical Features via Exact Fusion and Sparsity Regularisation

We study the high-dimensional linear regression problem with categorical predictors that have many levels. We propose a new estimation approach, which performs model compression via two mechanisms by simultaneously encouraging (a)…

Methodology · Statistics 2026-03-30 Kayhan Behdin , Riade Benbaki , Peter Radchenko , Rahul Mazumder

SynerGraph: An Integrated Graph Convolution Network for Multimodal Recommendation

This article presents a novel approach to multimodal recommendation systems, focusing on integrating and purifying multimodal data. Our methodology starts by developing a filter to remove noise from various types of data, making the…

Information Retrieval · Computer Science 2024-05-30 Mert Burabak , Tevfik Aytekin

Sparse Shortcuts: Facilitating Efficient Fusion in Multimodal Large Language Models

With the remarkable success of large language models (LLMs) in natural language understanding and generation, multimodal large language models (MLLMs) have rapidly advanced in their ability to process data across multiple modalities. While…

Computer Vision and Pattern Recognition · Computer Science 2026-02-03 Jingrui Zhang , Feng Liang , Yong Zhang , Wei Wang , Runhao Zeng , Xiping Hu

Weakly Supervised Instance Attention for Multisource Fine-Grained Object Recognition with an Application to Tree Species Classification

Multisource image analysis that leverages complementary spectral, spatial, and structural information benefits fine-grained object recognition that aims to classify an object into one of many similar subcategories. However, for multisource…

Computer Vision and Pattern Recognition · Computer Science 2021-05-27 Bulut Aygunes , Ramazan Gokberk Cinbis , Selim Aksoy

On multivariate randomized classification trees: $l_0$-based sparsity, VC~dimension and decomposition methods

Decision trees are widely-used classification and regression models because of their interpretability and good accuracy. Classical methods such as CART are based on greedy approaches but a growing attention has recently been devoted to…

Machine Learning · Computer Science 2021-12-16 Edoardo Amaldi , Antonio Consolo , Andrea Manno

Meta Fusion: A Unified Framework For Multimodality Fusion with Mutual Learning

Developing effective multimodal data fusion strategies has become increasingly essential for improving the predictive power of statistical machine learning methods across a wide range of applications, from autonomous driving to medical…

Machine Learning · Computer Science 2025-07-29 Ziyi Liang , Annie Qu , Babak Shahbaba