Related papers: Structure-Regularized Attention for Deformable Obj…

On the Generalization of Learned Structured Representations

Despite tremendous progress over the past decade, deep learning methods generally fall short of human-level systematic generalization. It has been argued that explicitly capturing the underlying structure of data should allow connectionist…

Machine Learning · Computer Science 2023-04-26 Andrea Dittadi

AttentionRNN: A Structured Spatial Attention Mechanism

Visual attention mechanisms have proven to be integrally important constituent components of many modern deep neural architectures. They provide an efficient and effective way to utilize visual information selectively, which has shown to be…

Computer Vision and Pattern Recognition · Computer Science 2019-05-24 Siddhesh Khandelwal, Leonid Sigal

Deformable ConvNets v2: More Deformable, Better Results

The superior performance of Deformable Convolutional Networks arises from its ability to adapt to the geometric variations of objects. Through an examination of its adaptive behavior, we observe that while the spatial support for its neural…

Computer Vision and Pattern Recognition · Computer Science 2018-11-29 Xizhou Zhu, Han Hu, Stephen Lin, Jifeng Dai

Describe and Attend to Track: Learning Natural Language guided Structural Representation and Visual Attention for Object Tracking

The tracking-by-detection framework requires a set of positive and negative training samples to learn robust tracking models for precise localization of target objects. However, existing tracking models mostly treat different samples…

Computer Vision and Pattern Recognition · Computer Science 2018-11-28 Xiao Wang, Chenglong Li, Rui Yang, Tianzhu Zhang, Jin Tang, Bin Luo

Object-Centric Learning with Slot Attention

Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not…

Machine Learning · Computer Science 2020-10-15 Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Self-supervised Visual Reinforcement Learning with Object-centric Representations

Autonomous agents need large repertoires of skills to act reasonably on new tasks that they have not seen before. However, acquiring these skills using only a stream of high-dimensional, unstructured, and unlabeled observations is a tricky…

Machine Learning · Computer Science 2021-02-09 Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

Explicitly Disentangled Representations in Object-Centric Learning

Extracting structured representations from raw visual data is an important and long-standing challenge in machine learning. Recently, techniques for unsupervised learning of object-centric representations have raised growing interest. In…

Computer Vision and Pattern Recognition · Computer Science 2025-01-24 Riccardo Majellaro, Jonathan Collu, Aske Plaat, Thomas M. Moerland

Structured Attention Networks

Attention networks have proven to be an effective approach for embedding categorical inference within a deep neural network. However, for many tasks we may want to model richer structural dependencies without abandoning end-to-end training.…

Computation and Language · Computer Science 2017-02-17 Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush

Generalization and Robustness Implications in Object-Centric Learning

The idea behind object-centric representation learning is that natural scenes can better be modeled as compositions of objects and their relations as opposed to distributed representations. This inductive bias can be injected into neural…

Machine Learning · Computer Science 2022-06-10 Andrea Dittadi, Samuele Papa, Michele De Vita, Bernhard Schölkopf, Ole Winther, Francesco Locatello

Attention-Based Explainability for Structure-Property Relationships

Machine learning methods are emerging as a universal paradigm for constructing correlative structure-property relationships in materials science based on multimodal characterization. However, this necessitates development of methods for…

Materials Science · Physics 2025-08-22 Boris N. Slautin, Utkarsh Pratiush, Yongtao Liu, Hiroshi Funakubo, Vladimir V. Shvartsman, Doru C. Lupascu, Sergei V. Kalinin

Beyond Centralization: Provable Communication Efficient Decentralized Multi-Task Learning

Representation learning is a widely adopted framework for learning in data-scarce environments, aiming to extract common features from related tasks. While centralized approaches have been extensively studied, decentralized methods remain…

Machine Learning · Computer Science 2025-12-30 Donghwa Kang, Shana Moothedath

Unsupervised learning of object landmarks by factorized spatial embeddings

Learning automatically the structure of object categories remains an important open problem in computer vision. In this paper, we propose a novel unsupervised approach that can discover and learn landmarks in object categories, thus…

Computer Vision and Pattern Recognition · Computer Science 2017-08-08 James Thewlis, Hakan Bilen, Andrea Vedaldi

An Empirical Study of Spatial Attention Mechanisms in Deep Networks

Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors and methods for computing attention from these factors affect performance. Toward a…

Computer Vision and Pattern Recognition · Computer Science 2019-04-15 Xizhou Zhu, Dazhi Cheng, Zheng Zhang, Stephen Lin, Jifeng Dai

Learning Structured Text Representations

In this paper, we focus on learning structure-aware document representations from data without recourse to a discourse parser or additional annotations. Drawing inspiration from recent efforts to empower neural networks with a structural…

Computation and Language · Computer Science 2018-02-06 Yang Liu, Mirella Lapata

Deep Attentional Structured Representation Learning for Visual Recognition

Structured representations, such as Bags of Words, VLAD and Fisher Vectors, have proven highly effective to tackle complex visual recognition tasks. As such, they have recently been incorporated into deep architectures. However, while…

Computer Vision and Pattern Recognition · Computer Science 2018-05-16 Krishna Kanth Nakka, Mathieu Salzmann

Learning Dynamic Attribute-factored World Models for Efficient Multi-object Reinforcement Learning

In many reinforcement learning tasks, the agent has to learn to interact with many objects of different types and generalize to unseen combinations and numbers of objects. Often a task is a composition of previously learned tasks (e.g.…

Machine Learning · Computer Science 2023-07-19 Fan Feng, Sara Magliacane

Contextual Interference Reduction by Selective Fine-Tuning of Neural Networks

Feature disentanglement of the foreground target objects and the background surrounding context has not been yet fully accomplished. The lack of network interpretability prevents advancing for feature disentanglement and better…

Computer Vision and Pattern Recognition · Computer Science 2020-11-24 Mahdi Biparva, John Tsotsos

Variational Structured Attention Networks for Deep Visual Representation Learning

Convolutional neural networks have enabled major progresses in addressing pixel-level prediction tasks such as semantic segmentation, depth estimation, surface normal prediction and so on, benefiting from their powerful capabilities in…

Computer Vision and Pattern Recognition · Computer Science 2021-12-16 Guanglei Yang, Paolo Rota, Xavier Alameda-Pineda, Dan Xu, Mingli Ding, Elisa Ricci

Contextualized word senses: from attention to compositionality

The neural architectures of language models are becoming increasingly complex, especially that of Transformers, based on the attention mechanism. Although their application to numerous natural language processing tasks has proven to be very…

Computation and Language · Computer Science 2023-12-04 Pablo Gamallo

Teaching Compositionality to CNNs

Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting…

Computer Vision and Pattern Recognition · Computer Science 2017-06-15 Austin Stone, Huayan Wang, Michael Stark, Yi Liu, D. Scott Phoenix, Dileep George