Structure-Regularized Attention for Deformable Object Representation
Abstract
Capturing contextual dependencies has proven useful for improving the representational power of deep neural networks. Recent approaches that model global context, such as self-attention and non-local operations, achieve this by enabling unconstrained pairwise interactions between elements. In this work, we consider learning representations for deformable objects, which can benefit from context exploitation that models the structural dependencies intrinsic to the data. To this end, we propose a structure-regularized attention mechanism that formalizes feature interaction as structural factorization using a pair of light-weight operations. The instantiated building blocks can be incorporated directly into modern convolutional neural networks to boost representational power efficiently. Comprehensive studies on multiple tasks and empirical comparisons with modern attention mechanisms demonstrate the gains of our method in both performance and model complexity. We further investigate its effect on feature representations, showing that our trained models capture diversified representations that characterize object parts without extra supervision.
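For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of unconstrained pairwise attention over a set of feature vectors. It is not the structure-regularized mechanism of this paper. The function names, the input layout (one vector per spatial position), and the use of raw features as queries, keys, and values without learned projections are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def global_self_attention(features):
    """Unconstrained pairwise attention over N feature vectors.

    features: list of N lists, each of length C (hypothetical layout:
    one vector per position of a flattened feature map).
    Queries, keys, and values are the raw features here (an assumption);
    practical blocks would apply learned projections first.
    """
    n = len(features)
    c = len(features[0])
    scale = 1.0 / math.sqrt(c)
    out = []
    for i in range(n):
        # Position i is scored against every position j: N * N pairs in total.
        scores = [
            scale * sum(a * b for a, b in zip(features[i], features[j]))
            for j in range(n)
        ]
        weights = softmax(scores)
        # The output at i is a convex combination of all feature vectors.
        out.append(
            [sum(weights[j] * features[j][k] for j in range(n)) for k in range(c)]
        )
    return out
```

Because every position attends to every other, the cost grows quadratically with the number of positions and nothing constrains which elements interact. The abstract's proposal replaces this with a structurally factorized interaction; its exact form is given in the paper and the linked repository, not here.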
Cite
@article{arxiv.2106.06672,
title = {Structure-Regularized Attention for Deformable Object Representation},
author = {Shenao Zhang and Li Shen and Zhifeng Li and Wei Liu},
journal = {arXiv preprint arXiv:2106.06672},
year = {2021}
}
Comments
Published at NeurIPS 2020 Workshop on Object Representations for Learning and Reasoning; code is available at https://github.com/shenao-zhang/StRA