This paper investigates a novel problem of generating images from visual attributes. We model the image as a composite of foreground and background and develop a layered generative model with disentangled latent variables that can be learned end-to-end using a variational auto-encoder. We experiment with natural images of faces and birds and demonstrate that the proposed models are capable of generating realistic and diverse samples with disentangled latent representations. For posterior inference of latent variables given novel images, we use a general energy minimization algorithm. With this inference procedure, the learned generative models show excellent quantitative and visual results in the tasks of attribute-conditioned image reconstruction and completion.
@article{arxiv.1512.00570,
  title   = {Attribute2Image: Conditional Image Generation from Visual Attributes},
  author  = {Xinchen Yan and Jimei Yang and Kihyuk Sohn and Honglak Lee},
  journal = {arXiv preprint arXiv:1512.00570},
  year    = {2016}
}
Comments: 19 pages; accepted at ECCV 2016 (the 14th European Conference on Computer Vision)