Multi-Instance Visual-Semantic Embedding

Computer Vision and Pattern Recognition 2015-12-23 v1

Abstract

Visual-semantic embedding models have recently been proposed and shown to be effective for image classification and zero-shot learning by mapping images into a continuous semantic label space. Although several approaches have been proposed for single-label embedding tasks, handling images with multiple labels (a more general setting) remains an open problem, mainly because of the complex underlying correspondence between an image and its labels. In this work, we present the Multi-Instance visual-semantic Embedding model (MIE) for embedding images associated with either single or multiple labels. Our model discovers semantically meaningful image subregions and maps them to their corresponding labels. We demonstrate the superiority of our method over the state of the art on two tasks: multi-label image annotation and zero-shot learning.

Cite

@article{arxiv.1512.06963,
  title  = {Multi-Instance Visual-Semantic Embedding},
  author = {Zhou Ren and Hailin Jin and Zhe Lin and Chen Fang and Alan Yuille},
  journal= {arXiv preprint arXiv:1512.06963},
  year   = {2015}
}

Comments

9 pages, CVPR 2016 submission
