Related papers: Multimodal learning with graphs

Learning on Multimodal Graphs: A Survey

Multimodal data pervades various domains, including healthcare, social media, and transportation, where multimodal graphs play a pivotal role. Machine learning on multimodal graphs, referred to as multimodal graph learning (MGL), is…

Machine Learning · Computer Science 2024-02-09 Ciyuan Peng, Jiayuan He, Feng Xia

Mixed Graphical Models for Causal Analysis of Multi-modal Variables

Graphical causal models are an important tool for knowledge discovery because they can represent both the causal relations between variables and the multivariate probability distributions over the data. Once learned, causal graphs can be…

Artificial Intelligence · Computer Science 2017-04-11 Andrew J Sedgewick, Joseph D. Ramsey, Peter Spirtes, Clark Glymour, Panayiotis V. Benos

Multimodal Graph Representation Learning with Dynamic Information Pathways

Multimodal graphs, where nodes contain heterogeneous features such as images and text, are increasingly common in real-world applications. Effectively learning on such graphs requires both adaptive intra-modal message passing and efficient…

Computer Vision and Pattern Recognition · Computer Science 2026-03-11 Xiaobin Hong, Mingkai Lin, Xiaoli Wang, Chaoqun Wang, Wenzhong Li

Multimodal Graph Learning for Generative Tasks

Multimodal learning combines multiple data modalities, broadening the types and complexity of data our models can utilize: for example, from plain text to image-caption pairs. Most multimodal learning algorithms focus on modeling simple…

Artificial Intelligence · Computer Science 2023-10-13 Minji Yoon, Jing Yu Koh, Bryan Hooi, Ruslan Salakhutdinov

A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language

Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire the single cognitive ability from the single molecular modality. Since the hierarchy…

Machine Learning · Computer Science 2022-09-14 Bing Su, Dazhao Du, Zhao Yang, Yujie Zhou, Jiangmeng Li, Anyi Rao, Hao Sun, Zhiwu Lu, Ji-Rong Wen

Relational inductive biases, deep learning, and graph networks

Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which…

Machine Learning · Computer Science 2018-10-18 Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, Yujia Li, Razvan Pascanu

Towards Graph Prompt Learning: A Survey and Beyond

Large-scale "pre-train and prompt learning" paradigms have demonstrated remarkable adaptability, enabling broad applications across diverse domains such as question answering, image recognition, and multimodal retrieval. This approach fully…

Machine Learning · Computer Science 2024-09-25 Qingqing Long, Yuchen Yan, Peiyan Zhang, Chen Fang, Wentao Cui, Zhiyuan Ning, Meng Xiao, Ning Cao, Xiao Luo, Lingjun Xu, Shiyue Jiang, Zheng Fang, Chong Chen, Xian-Sheng Hua, Yuanchun Zhou

HyperLearn: A Distributed Approach for Representation Learning in Datasets With Many Modalities

Multimodal datasets contain an enormous amount of relational information, which grows exponentially with the introduction of new modalities. Learning representations in such a scenario is inherently complex due to the presence of multiple…

Machine Learning · Computer Science 2019-09-24 Devanshu Arya, Stevan Rudinac, Marcel Worring

A survey of multimodal deep generative models

Multimodal learning is a framework for building models that make predictions based on different types of modalities. Important challenges in multimodal learning are the inference of shared representations from arbitrary modalities and…

Machine Learning · Computer Science 2022-07-06 Masahiro Suzuki, Yutaka Matsuo

Multimodal Representation Learning and Fusion

Multi-modal learning is a fast growing area in artificial intelligence. It tries to help machines understand complex things by combining information from different sources, like images, text, and audio. By using the strengths of each…

Machine Learning · Computer Science 2025-12-22 Qihang Jin, Enze Ge, Yuhang Xie, Hongying Luo, Junhao Song, Ziqian Bi, Chia Xin Liang, Jibin Guan, Joe Yeong, Xinyuan Song, Junfeng Hao

Bayesian Deep Learning for Graphs

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an…

Machine Learning · Computer Science 2022-02-28 Federico Errica

Survey of Large Multimodal Model Datasets, Application Categories and Taxonomy

Multimodal learning, a rapidly evolving field in artificial intelligence, seeks to construct more versatile and robust systems by integrating and analyzing diverse types of data, including text, images, audio, and video. Inspired by the…

Artificial Intelligence · Computer Science 2024-12-24 Priyaranjan Pattnayak, Hitesh Laxmichand Patel, Bhargava Kumar, Amit Agarwal, Ishan Banerjee, Srikant Panda, Tejaswini Kumar

Learning Factorized Multimodal Representations

Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information,…

Machine Learning · Computer Science 2019-05-15 Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov

Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning

Graph machine learning has made significant strides in recent years, yet the integration of visual information with graph structure and its potential for improving performance in downstream tasks remains an underexplored area. To address…

Machine Learning · Computer Science 2025-04-01 Jing Zhu, Yuhang Zhou, Shengyi Qian, Zhongmou He, Tong Zhao, Neil Shah, Danai Koutra

State of the Art and Potentialities of Graph-level Learning

Graphs have a superior ability to represent relational data, like chemical compounds, proteins, and social networks. Hence, graph-level learning, which takes a set of graphs as input, has been applied to many tasks including comparison,…

Machine Learning · Computer Science 2023-05-26 Zhenyu Yang, Ge Zhang, Jia Wu, Jian Yang, Quan Z. Sheng, Shan Xue, Chuan Zhou, Charu Aggarwal, Hao Peng, Wenbin Hu, Edwin Hancock, Pietro Liò

Multi-modal Graph Learning for Disease Prediction

Benefiting from the powerful expressive capability of graphs, graph-based approaches have achieved impressive performance in various biomedical applications. Most existing methods tend to define the adjacency matrix among samples manually…

Machine Learning · Computer Science 2021-07-02 Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Zhenyu Guo, Yang Liu, Yao Zhao

Towards Multimodal Graph Large Language Model

Multi-modal graphs, which integrate diverse multi-modal features and relations, are ubiquitous in real-world applications. However, existing multi-modal graph learning methods are typically trained from scratch for specific graph data and…

Machine Learning · Computer Science 2025-11-26 Xin Wang, Zeyang Zhang, Linxin Xiao, Haibo Chen, Chendi Ge, Wenwu Zhu

Graph AI in Medicine

In clinical artificial intelligence (AI), graph representation learning, mainly through graph neural networks (GNNs), stands out for its capability to capture intricate relationships within structured clinical datasets. With diverse data --…

Machine Learning · Computer Science 2023-12-13 Ruth Johnson, Michelle M. Li, Ayush Noori, Owen Queen, Marinka Zitnik

Analyzing Unaligned Multimodal Sequence via Graph Convolution and Graph Pooling Fusion

In this paper, we study the task of multimodal sequence analysis which aims to draw inferences from visual, language and acoustic sequences. A majority of existing works generally focus on aligned fusion, mostly at word level, of the three…

Artificial Intelligence · Computer Science 2021-04-26 Sijie Mai, Songlong Xing, Jiaxuan He, Ying Zeng, Haifeng Hu

Graph-based Interaction Augmentation Network for Robust Multimodal Sentiment Analysis

The inevitable modality imperfection in real-world scenarios poses significant challenges for Multimodal Sentiment Analysis (MSA). While existing methods tailor reconstruction or joint representation learning strategies to restore missing…

Multimedia · Computer Science 2025-08-05 Hu Zhangfeng, Shi mengxin