Learning Visual-Semantic Subspace Representations

Computer Vision and Pattern Recognition 2025-04-15 v2 Machine Learning

Abstract

Learning image representations that capture rich semantic relationships remains a significant challenge. Existing approaches are either contrastive and lack robust theoretical guarantees, or struggle to represent the partial orders inherent to structured visual-semantic data. In this paper, we introduce a nuclear norm-based loss function, grounded in the same information-theoretic principles that have proved effective in self-supervised learning. We present a theoretical characterization of this loss, demonstrating that, in addition to promoting class orthogonality, it encodes the spectral geometry of the data within a subspace lattice. This geometric representation allows us to associate logical propositions with subspaces, ensuring that our learned representations adhere to a predefined symbolic structure.
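
As a rough illustration of why a nuclear norm term can favor orthogonal embeddings, the toy sketch below compares the nuclear norm (sum of singular values) of two small embedding matrices with unit-norm rows. It is not the loss defined in the paper; the helper `nuclear_norm` and the matrices `orthogonal` and `collapsed` are hypothetical names chosen for this example, and NumPy is assumed to be available.

```python
import numpy as np

def nuclear_norm(Z):
    """Return the nuclear norm of Z, i.e. the sum of its singular values."""
    return float(np.linalg.svd(Z, compute_uv=False).sum())

# Hypothetical toy embeddings: each row is a unit-norm feature vector.
orthogonal = np.eye(3)                          # three mutually orthogonal rows
collapsed = np.tile([1.0, 0.0, 0.0], (3, 1))    # three identical rows

norm_orthogonal = nuclear_norm(orthogonal)  # singular values are (1, 1, 1)
norm_collapsed = nuclear_norm(collapsed)    # singular values are (sqrt(3), 0, 0)

print(norm_orthogonal, norm_collapsed)
```

For a fixed number of unit-norm rows, the nuclear norm is largest when the rows are mutually orthogonal and smallest when they all coincide, which is the intuition behind using it to spread classes across orthogonal directions.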

Cite

@article{arxiv.2405.16213,
  title  = {Learning Visual-Semantic Subspace Representations},
  author = {Gabriel Moreira and Manuel Marques and João Paulo Costeira and Alexander Hauptmann},
  journal= {arXiv preprint arXiv:2405.16213},
  year   = {2025}
}

Comments

The 28th International Conference on Artificial Intelligence and Statistics (AISTATS)
