Related papers: Multi-Objective Matrix Normalization for Fine-grai…

Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization

Training a fine-grained image recognition model with limited data presents a significant challenge, as the subtle differences between categories may not be easily discernible amidst distracting noise patterns. One commonly employed strategy…

Computer Vision and Pattern Recognition · Computer Science 2024-11-27 Avraham Chapman, Haiming Xu, Lingqiao Liu

MoNet: Moments Embedding Network

Bilinear pooling has been recently proposed as a feature encoding layer, which can be used after the convolutional layers of a deep network, to improve performance in multiple vision tasks. Different from conventional global average pooling…

Computer Vision and Pattern Recognition · Computer Science 2018-04-02 Mengran Gou, Fei Xiong, Octavia Camps, Mario Sznaier

Is Second-order Information Helpful for Large-scale Visual Recognition?

By stacking layers of convolution and nonlinearity, convolutional networks (ConvNets) effectively learn from low-level to high-level features and discriminative representations. Since the end goal of large-scale recognition is to delineate…

Computer Vision and Pattern Recognition · Computer Science 2018-04-03 Peihua Li, Jiangtao Xie, Qilong Wang, Wangmeng Zuo

RegBN: Batch Normalization of Multimodal Data with Regularization

Recent years have witnessed a surge of interest in integrating high-dimensional data captured by multisource sensors, driven by the impressive success of neural networks in the integration of multimodal data. However, the integration of…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Morteza Ghahremani, Christian Wachinger

Improved Bilinear Pooling with CNNs

Bilinear pooling of Convolutional Neural Network (CNN) features [22, 23], and their compact variants [10], have been shown to be effective at fine-grained recognition, scene categorization, texture recognition, and visual question-answering…

Computer Vision and Pattern Recognition · Computer Science 2017-07-24 Tsung-Yu Lin, Subhransu Maji

Learning Maximally Monotone Operators for Image Recovery

We introduce a new paradigm for solving regularized variational problems. These are typically formulated to address ill-posed inverse problems encountered in signal and image processing. The objective function is traditionally defined by…

Optimization and Control · Mathematics 2021-04-22 Jean-Christophe Pesquet, Audrey Repetti, Matthieu Terris, Yves Wiaux

Multi-Granularity Modularized Network for Abstract Visual Reasoning

Abstract visual reasoning connects mental abilities to the physical world, which is a crucial factor in cognitive development. Most toddlers display sensitivity to this skill, but it is not easy for machines. Aimed at it, we focus on the…

Artificial Intelligence · Computer Science 2020-07-13 Xiangru Tang, Haoyuan Wang, Xiang Pan, Jiyang Qi

Compare More Nuanced: Pairwise Alignment Bilinear Network For Few-shot Fine-grained Learning

The recognition ability of human beings is developed in a progressive way. Usually, children learn to discriminate various objects from coarse to fine-grained with limited supervision. Inspired by this learning process, we propose a simple…

Computer Vision and Pattern Recognition · Computer Science 2020-01-22 Huaxi Huang, Junjie Zhang, Jian Zhang, Qiang Wu, Jingsong Xu

Deep Modularity Networks with Diversity-Preserving Regularization

Graph clustering plays a crucial role in graph representation learning but often faces challenges in achieving feature-space diversity. While Deep Modularity Networks (DMoN) leverage modularity maximization and collapse regularization to…

Machine Learning · Computer Science 2025-11-04 Yasmin Salehi, Dennis Giannacopoulos

Adaptive Multi-Order Graph Regularized NMF with Dual Sparsity for Hyperspectral Unmixing

Hyperspectral unmixing (HU) is a critical yet challenging task in remote sensing. However, existing nonnegative matrix factorization (NMF) methods with graph learning mostly focus on first-order or second-order nearest neighbor…

Computer Vision and Pattern Recognition · Computer Science 2025-09-26 Hui Chen, Liangyu Liu, Xianchao Xiu, Wanquan Liu

Visual Understanding via Multi-Feature Shared Learning with Global Consistency

Image/video data is usually represented with multiple visual features. Fusion of multi-source information for establishing the attributes has been widely recognized. Multi-feature visual recognition has recently received much attention in…

Computer Vision and Pattern Recognition · Computer Science 2016-11-15 Lei Zhang, David Zhang

Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation

Discriminative representation is essential to keep a unique identifier for each target in Multiple object tracking (MOT). Some recent MOT methods extract features of the bounding box region or the center point as identity embeddings.…

Computer Vision and Pattern Recognition · Computer Science 2023-03-09 Hao Ren, Shoudong Han, Huilin Ding, Ziwen Zhang, Hongwei Wang, Faquan Wang

Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization

Compared with global average pooling in existing deep convolutional neural networks (CNNs), global covariance pooling can capture richer statistics of deep features, having potential for improving representation and generalization abilities…

Computer Vision and Pattern Recognition · Computer Science 2020-08-12 Qilong Wang, Jiangtao Xie, Wangmeng Zuo, Lei Zhang, Peihua Li

Bilinear Parameterization For Differentiable Rank-Regularization

Low rank approximation is a commonly occurring problem in many computer vision and machine learning applications. There are two common ways of optimizing the resulting models. Either the set of matrices with a given rank can be explicitly…

Computer Vision and Pattern Recognition · Computer Science 2019-07-24 Marcus Valtonen Örnhag, Carl Olsson, Anders Heyden

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

This paper presents MaVEn, an innovative Multi-granularity Visual Encoding framework designed to enhance the capabilities of Multimodal Large Language Models (MLLMs) in multi-image reasoning. Current MLLMs primarily focus on single-image…

Computation and Language · Computer Science 2024-08-27 Chaoya Jiang, Jia Hongrui, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang

Hierarchical Mask-Enhanced Dual Reconstruction Network for Few-Shot Fine-Grained Image Classification

Few-shot fine-grained image classification (FS-FGIC) presents a significant challenge, requiring models to distinguish visually similar subclasses with limited labeled examples. Existing methods have critical limitations: metric-based…

Computer Vision and Pattern Recognition · Computer Science 2025-06-26 Ning Luo, Meiyin Hu, Huan Wan, Yanyan Yang, Zhuohang Jiang, Xin Wei

Machine-learned Regularization and Polygonization of Building Segmentation Masks

We propose a machine learning based approach for automatic regularization and polygonization of building segmentation masks. Taking an image as input, we first predict building segmentation maps exploiting generic fully convolutional…

Computer Vision and Pattern Recognition · Computer Science 2020-12-18 Stefano Zorzi, Ksenia Bittner, Friedrich Fraundorfer

Rethinking Normalization Strategies and Convolutional Kernels for Multimodal Image Fusion

Multimodal image fusion (MMIF) integrates information from different modalities to obtain a comprehensive image, aiding downstream tasks. However, existing research focuses on complementary information fusion and training strategies,…

Computer Vision and Pattern Recognition · Computer Science 2025-12-12 Dan He, Guofen Wang, Weisheng Li, Yucheng Shu, Wenbo Li, Lijian Yang, Yuping Huang, Feiyan Li

Multi-scale Unified Network for Image Classification

Convolutional Neural Networks (CNNs) have advanced significantly in visual representation learning and recognition. However, they face notable challenges in performance and computational efficiency when dealing with real-world, multi-scale…

Computer Vision and Pattern Recognition · Computer Science 2024-03-28 Wenzhuo Liu, Fei Zhu, Cheng-Lin Liu

Multi-scale Orderless Pooling of Deep Convolutional Activation Features

Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of…

Computer Vision and Pattern Recognition · Computer Science 2014-09-10 Yunchao Gong, Liwei Wang, Ruiqi Guo, Svetlana Lazebnik