Related papers: Multi-Granularity Modularized Network for Abstract…

Deep Non-Monotonic Reasoning for Visual Abstract Reasoning Tasks

While achieving unmatched performance on many well-defined tasks, deep learning models have also been used to solve visual abstract reasoning tasks, which are relatively less well-defined, and have been widely used to measure human…

Computer Vision and Pattern Recognition · Computer Science 2023-02-15 Yuan Yang, Deepayan Sanyal, Joel Michelson, James Ainooson, Maithilee Kunda

ViTCN: Vision Transformer Contrastive Network For Reasoning

Machine learning models have achieved significant milestones in various domains, for example, computer vision models have an exceptional result in object recognition, and in natural language processing, where Large Language Models (LLM)…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Bo Song, Yuanhao Xu, Yichao Wu

OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning

A key aspect of human intelligence is the ability to imagine -- composing learned concepts in novel ways -- to make sense of new scenarios. Such capacity is not yet attained for machine learning systems. In this work, in the context of…

Artificial Intelligence · Computer Science 2023-10-31 Rim Assouel, Pau Rodriguez, Perouz Taslakian, David Vazquez, Yoshua Bengio

Multimodal Representations for Teacher-Guided Compositional Visual Reasoning

Neural Module Networks (NMN) are a compelling method for visual question answering, enabling the translation of a question into a program consisting of a series of reasoning sub-tasks that are sequentially executed on the image to produce…

Computation and Language · Computer Science 2023-10-25 Wafa Aissa, Marin Ferecatu, Michel Crucianu

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Answering questions that require reading texts in an image is challenging for current models. One key difficulty of this task is that rare, polysemous, and ambiguous words frequently appear in images, e.g., names of places, products, and…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Difei Gao, Ke Li, Ruiping Wang, Shiguang Shan, Xilin Chen

Learning to reason over visual objects

A core component of human intelligence is the ability to identify abstract patterns inherent in complex, high-dimensional perceptual data, as exemplified by visual reasoning tasks such as Raven's Progressive Matrices (RPM). Motivated by the…

Computer Vision and Pattern Recognition · Computer Science 2023-10-30 Shanka Subhra Mondal, Taylor Webb, Jonathan D. Cohen

A Survey of Multimodal Mathematical Reasoning: From Perception, Alignment to Reasoning

Multimodal Mathematical Reasoning (MMR) has recently attracted increasing attention for its capability to solve mathematical problems involving both textual and visual modalities. However, current models still face significant challenges in…

Artificial Intelligence · Computer Science 2026-04-15 Tianyu Yang, Sihong Wu, Yilun Zhao, Zhenwen Liang, Lisen Dai, Chen Zhao, Minhao Cheng, Arman Cohan, Xiangliang Zhang

RAVEN: A Dataset for Relational and Analogical Visual rEasoNing

Dramatic progress has been witnessed in basic vision tasks involving low-level perception, such as object recognition, detection, and tracking. Unfortunately, there is still an enormous performance gap between artificial vision systems and…

Computer Vision and Pattern Recognition · Computer Science 2019-03-08 Chi Zhang, Feng Gao, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu

Learning Abstract Visual Reasoning via Task Decomposition: A Case Study in Raven Progressive Matrices

Learning to perform abstract reasoning often requires decomposing the task in question into intermediate subgoals that are not specified upfront, but need to be autonomously devised by the learner. In Raven Progressive Matrices (RPM), the…

Artificial Intelligence · Computer Science 2024-03-08 Jakub Kwiatkowski, Krzysztof Krawiec

Modeling Gestalt Visual Reasoning on the Raven's Progressive Matrices Intelligence Test Using Generative Image Inpainting Techniques

Psychologists recognize Raven's Progressive Matrices as a very effective test of general human intelligence. While many computational models have been developed by the AI community to investigate different forms of top-down, deliberative…

Computer Vision and Pattern Recognition · Computer Science 2019-11-27 Tianyu Hua, Maithilee Kunda

DAReN: A Collaborative Approach Towards Reasoning And Disentangling

Computational learning approaches to solving visual reasoning tests, such as Raven's Progressive Matrices (RPM), critically depend on the ability to identify the visual concepts used in the test (i.e., the representation) as well as the…

Machine Learning · Computer Science 2022-07-01 Pritish Sahu, Kalliopi Basioti, Vladimir Pavlovic

A Feature-based Generalizable Prediction Model for Both Perceptual and Abstract Reasoning

A hallmark of human intelligence is the ability to infer abstract rules from limited experience and apply these rules to unfamiliar situations. This capacity is widely studied in the visual domain using the Raven's Progressive Matrices.…

Artificial Intelligence · Computer Science 2025-12-22 Quan Do, Thomas M. Morin, Chantal E. Stern, Michael E. Hasselmo

Meta Module Network for Compositional Visual Reasoning

Neural Module Network (NMN) exhibits strong interpretability and compositionality thanks to its handcrafted neural modules with explicit multi-hop reasoning capability. However, most NMNs suffer from two critical drawbacks: 1) scalability:…

Computer Vision and Pattern Recognition · Computer Science 2020-11-10 Wenhu Chen, Zhe Gan, Linjie Li, Yu Cheng, William Wang, Jingjing Liu

Modal Logical Neural Networks

We propose Modal Logical Neural Networks (MLNNs), a neurosymbolic framework that integrates deep learning with the formal semantics of modal logic, enabling reasoning about necessity and possibility. Drawing on Kripke semantics, we…

Machine Learning · Computer Science 2026-02-13 Antonin Sulc

Learning Differentiable Logic Programs for Abstract Visual Reasoning

Visual reasoning is essential for building intelligent agents that understand the world and perform problem-solving beyond perception. Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine…

Machine Learning · Computer Science 2025-07-08 Hikaru Shindo, Viktor Pfanschilling, Devendra Singh Dhami, Kristian Kersting

V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced. This is…

Computer Vision and Pattern Recognition · Computer Science 2019-07-30 Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

Unsupervised Abstract Reasoning for Raven's Problem Matrices

Raven's Progressive Matrices (RPM) is highly correlated with human intelligence, and it has been widely used to measure the abstract reasoning ability of humans. In this paper, to study the abstract reasoning capability of deep neural…

Computer Vision and Pattern Recognition · Computer Science 2021-09-22 Tao Zhuo, Qiang Huang, Mohan Kankanhalli

From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering

In order to achieve a general visual question answering (VQA) system, it is essential to learn to answer deeper questions that require compositional reasoning on the image and external knowledge. Meanwhile, the reasoning process should be…

Computer Vision and Pattern Recognition · Computer Science 2022-06-28 Zihao Zhu

Visual Question Reasoning on General Dependency Tree

The collaborative reasoning for understanding each image-question pair is very critical but under-explored for an interpretable Visual Question Answering (VQA) system. Although very recent works also tried the explicit compositional…

Computer Vision and Pattern Recognition · Computer Science 2018-04-03 Qingxing Cao, Xiaodan Liang, Bailing Li, Guanbin Li, Liang Lin

Multi-Objective Matrix Normalization for Fine-grained Visual Recognition

Bilinear pooling achieves great success in fine-grained visual recognition (FGVC). Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features, but some problems, e.g.,…

Computer Vision and Pattern Recognition · Computer Science 2020-04-13 Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang