Pluralistic Image Completion

Computer Vision and Pattern Recognition 2019-04-08 v2

Abstract

Most image completion methods produce only one result for each masked input, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion: the task of generating multiple and diverse plausible solutions for image completion. A major challenge faced by learning-based approaches is that there is usually only one ground truth training instance per label. As such, sampling from conditional VAEs still leads to minimal diversity. To overcome this, we propose a novel and probabilistically principled framework with two parallel paths. One is a reconstructive path that uses the single given ground truth to obtain a prior distribution of the missing parts and rebuilds the original image from this distribution. The other is a generative path for which the conditional prior is coupled to the distribution obtained in the reconstructive path. Both are supported by GANs. We also introduce a new short+long term attention layer that exploits distant relations among decoder and encoder features, improving appearance consistency. When tested on datasets with buildings (Paris), faces (CelebA-HQ), and natural images (ImageNet), our method not only generated higher-quality completion results, but also produced multiple and diverse plausible outputs.
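
The abstract does not spell out how the conditional prior is "coupled" to the reconstructive distribution, nor how diverse outputs are drawn. As a rough illustration only, assuming both distributions are diagonal Gaussians (an assumption, not something stated above), a coupling term could be a KL divergence, and diversity could come from drawing several latent codes. The names `kl_diag_gaussians` and `sample_latents` are hypothetical and are not taken from the paper or its code.

```python
import math
import random


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for two diagonal Gaussians.

    Each argument is a list of per-dimension means or log-variances.
    Assumption: Gaussian latents; the abstract does not specify this.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        # Closed-form KL between two univariate Gaussians, summed over dimensions.
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def sample_latents(mu, logvar, n, seed=0):
    """Draw n latent codes z = mu + sigma * eps (reparameterization trick)."""
    rng = random.Random(seed)
    return [
        [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical 2-D distributions standing in for the two paths.
    recon_mu, recon_logvar = [0.5, -0.2], [0.0, 0.0]
    prior_mu, prior_logvar = [0.0, 0.0], [0.0, 0.0]
    print("coupling term:", kl_diag_gaussians(recon_mu, recon_logvar, prior_mu, prior_logvar))
    # Each sampled code would be passed to a decoder to yield one completion.
    for z in sample_latents(prior_mu, prior_logvar, n=3):
        print("latent code:", z)
```

In such a setup, each sampled latent code would be decoded together with the visible pixels to produce one candidate completion; the decoder itself is omitted here.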

Cite

@article{arxiv.1903.04227,
  title  = {Pluralistic Image Completion},
  author = {Chuanxia Zheng and Tat-Jen Cham and Jianfei Cai},
  journal= {arXiv preprint arXiv:1903.04227},
  year   = {2019}
}

Comments

21 pages, 16 figures
