Related papers: Training Multimodal Systems for Classification wit…

200 papers

Human perception of the empirical world involves recognizing the diverse appearances, or 'modalities', of underlying objects. Despite the longstanding consideration of this perspective in philosophy and cognitive science, the study of…

Machine Learning · Computer Science 2023-12-19 Zhou Lu

Neural networks can be powerful function approximators, which are able to model high-dimensional feature distributions from a subset of examples drawn from the target distribution. Naturally, they perform well at generalizing within the…

Machine Learning · Computer Science 2021-08-06 Aaron Eisermann, Jae Hee Lee, Cornelius Weber, Stefan Wermter

The rapid evolution of machine learning has propelled neural networks to unprecedented success across diverse domains. In particular, multimodal learning has emerged as a transformative paradigm, leveraging complementary information from…

Machine Learning · Computer Science 2025-11-14 Fushuo Huo

Multi-modal learning is a fast growing area in artificial intelligence. It tries to help machines understand complex things by combining information from different sources, like images, text, and audio. By using the strengths of each…

Machine Learning · Computer Science 2025-12-22 Qihang Jin, Enze Ge, Yuhang Xie, Hongying Luo, Junhao Song, Ziqian Bi, Chia Xin Liang, Jibin Guan, Joe Yeong, Xinyuan Song, Junfeng Hao

Multimodal learning has mainly focused on learning large models on, and fusing feature representations from, different modalities for better performances on downstream tasks. In this work, we take a detour from this trend and study the…

Computer Vision and Pattern Recognition · Computer Science 2023-05-08 Yifeng Shi, Marc Niethammer

A multi-modal machine learning system uses multiple unique data sources and types to improve its performance. This article proposes a system that combines results from several types of models, all of which are trained on different data…

Machine Learning · Computer Science 2024-02-05 Aaron Mullen, Samuel E. Armstrong, Jasmine Perdeh, Bjorn Bauer, Jeffrey Talbert, V. K. Cody Bumgardner

Multimodal learning integrates information from different modalities to enhance model performance, yet it often suffers from modality imbalance, where dominant modalities overshadow weaker ones during joint optimization. This paper reveals…

Machine Learning · Computer Science 2025-10-17 Xiaoyu Ma, Hao Chen

Consider end-to-end training of a multi-modal vs. a single-modal network on a task with multiple input modalities: the multi-modal network receives more information, so it should match or outperform its single-modal counterpart. In our…

Computer Vision and Pattern Recognition · Computer Science 2020-04-06 Weiyao Wang, Du Tran, Matt Feiszli

Learning-enabled control systems increasingly rely on multiple sensing modalities (e.g., vision, audio, language, etc.) for perception and decision support. A key challenge is that multi-modal sensor training dynamics are often imbalanced:…

Machine Learning · Computer Science 2026-04-01 Heshan Fernando, Quan Xiao, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen

Many prediction problems, such as those that arise in the context of robotics, have a simplifying underlying structure that, if known, could accelerate learning. In this paper, we present a strategy for learning a set of neural network…

Machine Learning · Computer Science 2019-05-06 Ferran Alet, Tomás Lozano-Pérez, Leslie P. Kaelbling

Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as…

Machine Learning · Computer Science 2017-08-02 Tadas Baltrušaitis, Chaitanya Ahuja, Louis-Philippe Morency

Multimodal learning helps to comprehensively understand the world, by integrating different senses. Accordingly, multiple input modalities are expected to boost model performance, but we actually find that they are not fully exploited even…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Xiaokang Peng, Yake Wei, Andong Deng, Dong Wang, Di Hu

Multimodal machine learning has gained significant attention in recent years due to its potential for integrating information from multiple modalities to enhance learning and decision-making processes. However, it is commonly observed that…

Machine Learning · Computer Science 2025-09-12 Sahiti Yerramilli, Jayant Sravan Tamarapalli, Jonathan Francis, Eric Nyberg

Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on…

Recent technological advancements in multimodal machine learning--including the rise of large language models (LLMs)--have improved our ability to collect, process, and analyze diverse multimodal data such as speech, video, and eye gaze in…

The ability to quickly learn a new task with minimal instruction - known as few-shot learning - is a central aspect of intelligent agents. Classical few-shot benchmarks make use of few-shot samples from a single modality, but such samples…

Computer Vision and Pattern Recognition · Computer Science 2024-08-29 Zhiqiu Lin, Samuel Yu, Zhiyi Kuang, Deepak Pathak, Deva Ramanan

One of the key factors of enabling machine learning models to comprehend and solve real-world tasks is to leverage multimodal data. Unfortunately, annotation of multimodal data is challenging and expensive. Recently, self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2020-12-11 Elad Amrani, Rami Ben-Ari, Daniel Rotman, Alex Bronstein

Traditional multimodal methods often assume static modality quality, which limits their adaptability in dynamic real-world scenarios. Thus, dynamical multimodal methods are proposed to assess modality quality and adjust their contribution…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Shicai Wei, Kaijie Zhang, Luyi Chen, Tao He, Guiduo Duan

The natural world is abundant with concepts expressed via visual, acoustic, tactile, and linguistic modalities. Much of the existing progress in multimodal learning, however, focuses primarily on problems where the same set of modalities…

Machine Learning · Computer Science 2020-12-08 Paul Pu Liang, Peter Wu, Liu Ziyin, Louis-Philippe Morency, Ruslan Salakhutdinov

A core aspect of human intelligence is the ability to learn new tasks quickly and switch between them flexibly. Here, we describe a modular continual reinforcement learning paradigm inspired by these abilities. We first introduce a visual…

Machine Learning · Computer Science 2017-12-13 Kevin T. Feigelis, Blue Sheffer, Daniel L. K. Yamins