Evolving Image Compositions for Feature Representation Learning

Computer Vision and Pattern Recognition; Machine Learning; Neural and Evolutionary Computing. Version v2, 2022-04-04.

Abstract

Convolutional neural networks for visual recognition require large numbers of training samples and usually benefit from data augmentation. This paper proposes PatchMix, a data augmentation method that creates new samples by composing patches from pairs of images in a grid-like pattern. These new samples are assigned label scores proportional to the number of patches borrowed from each image. We then add a set of patch-level losses that regularize training and encourage good representations at both the patch and image levels. A ResNet-50 model trained on ImageNet using PatchMix exhibits superior transfer learning capabilities across a wide array of benchmarks. Although PatchMix can rely on random pairings and random grid-like patterns for mixing, we explore evolutionary search as a guiding strategy to jointly discover optimal grid-like patterns and image pairings. For this purpose, we devise a fitness function that bypasses the need to re-train a model to evaluate each possible choice. In this way, PatchMix outperforms a base model on CIFAR-10 (+1.91), CIFAR-100 (+5.31), Tiny ImageNet (+3.52), and ImageNet (+1.16).
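
The mixing step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `patchmix_pair`, the `grid` parameter, and the choice to pick each grid cell from either image with probability 0.5 are assumptions made for illustration. It covers only the random-pattern case and omits the patch-level losses and the evolutionary search.

```python
import numpy as np


def patchmix_pair(img_a, img_b, label_a, label_b, grid=4, rng=None):
    """Compose two images on a grid x grid layout (hypothetical helper).

    Returns the mixed image, a soft label weighted by the fraction of
    cells taken from each image, and the boolean cell mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    if img_a.shape != img_b.shape:
        raise ValueError("both images must have the same shape")
    h, w = img_a.shape[:2]
    if h % grid or w % grid:
        raise ValueError("image height and width must be divisible by grid")
    ph, pw = h // grid, w // grid

    # Assumption: each cell is drawn independently; True means "take from img_b".
    mask = rng.random((grid, grid)) < 0.5

    # Expand the cell-level mask to pixel resolution.
    pixel_mask = np.repeat(np.repeat(mask, ph, axis=0), pw, axis=1)
    if img_a.ndim == 3:
        pixel_mask = pixel_mask[..., None]  # broadcast over channels

    mixed = np.where(pixel_mask, img_b, img_a)

    # Label scores proportional to the number of patches from each image.
    frac_b = mask.mean()
    soft_label = (1.0 - frac_b) * label_a + frac_b * label_b
    return mixed, soft_label, mask


if __name__ == "__main__":
    a = np.zeros((8, 8, 3))
    b = np.ones((8, 8, 3))
    ya = np.array([1.0, 0.0])
    yb = np.array([0.0, 1.0])
    mixed, y, mask = patchmix_pair(a, b, ya, yb, grid=4,
                                   rng=np.random.default_rng(0))
    print(mask.astype(int))
    print("soft label:", y)
```

With an all-zeros image and an all-ones image, the mean of the mixed image equals the fraction of cells taken from the second image, which is also the weight placed on its label.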

Cite

@article{arxiv.2106.09011,
  title  = {Evolving Image Compositions for Feature Representation Learning},
  author = {Paola Cascante-Bonilla and Arshdeep Sekhon and Yanjun Qi and Vicente Ordonez},
  journal= {arXiv preprint arXiv:2106.09011},
  year   = {2022}
}

Comments

Accepted to BMVC 2021. Camera-Ready version. Project page: https://paolacascante.com/patchmix/index.html