Towards Device Efficient Conditional Image Generation

Computer Vision and Pattern Recognition; Graphics; Image and Video Processing. 2022-10-17 (v2)

Abstract

We present a novel algorithm that reduces the tensor compute required by a conditional image generation autoencoder without sacrificing the quality of photo-realistic image generation. Our method is device agnostic and can optimize an autoencoder for a given CPU-only or GPU compute device in about the time it normally takes to train an autoencoder on a generic workstation. We achieve this with a novel two-stage strategy. First, we condense the channel weights so that as few channels as possible are used. Then, we prune the nearly zeroed-out weight activations and fine-tune the autoencoder. To maintain image quality, fine-tuning is done via student-teacher training, where we reuse the condensed autoencoder as the teacher. We show performance gains on several conditional image generation tasks across multiple compute devices: segmentation mask to face image generation, face image cartoonization, and a CycleGAN-based model. We perform ablation studies to justify our claims and design choices, and we achieve real-time versions of several autoencoders on CPU-only devices while maintaining image quality, thus enabling at-scale deployment of such autoencoders.
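
The pruning and student-teacher steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the pruning criterion or the fine-tuning loss, so the per-channel mean-magnitude test, the `threshold` value, the L1 distillation loss, and the function names `prune_channels` and `distillation_loss` are all assumptions made for illustration.

```python
import numpy as np

def prune_channels(weight, threshold=1e-3):
    """Drop output channels whose mean absolute weight is below threshold.

    weight: array of shape (out_channels, in_channels, kh, kw).
    Returns the pruned weight and the boolean keep mask.
    The magnitude criterion and threshold are assumptions, not from the paper.
    """
    magnitude = np.abs(weight).mean(axis=(1, 2, 3))
    keep = magnitude >= threshold
    return weight[keep], keep

def distillation_loss(student_out, teacher_out):
    """Mean absolute difference between student and teacher outputs
    (an assumed stand-in for the student-teacher fine-tuning objective)."""
    return float(np.mean(np.abs(student_out - teacher_out)))

# Toy "condensed" layer: 8 output channels, 5 of them nearly zeroed out.
rng = np.random.default_rng(0)
weight = rng.normal(size=(8, 4, 3, 3))
weight[[1, 3, 4, 6, 7]] *= 1e-6
pruned, keep = prune_channels(weight)
print(pruned.shape, int(keep.sum()))
```

In this toy setup the condensed layer plays the role of the teacher, and the pruned layer would then be fine-tuned so that its outputs stay close to the teacher's under `distillation_loss`.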

Cite

@article{arxiv.2203.10363,
  title  = {Towards Device Efficient Conditional Image Generation},
  author = {Nisarg A. Shah and Gaurav Bharaj},
  journal= {arXiv preprint arXiv:2203.10363},
  year   = {2022}
}

Comments

British Machine Vision Conference 2022
