Poly-View Contrastive Learning

Amitis Shidani; Devon Hjelm; Jason Ramapuram; Russ Webb; Eeshan Gunesh Dhekane; Dan Busbridge

Machine Learning; 2024-03-11; v1; Artificial Intelligence; Computer Vision and Pattern Recognition; Information Theory; math.IT; Machine Learning

Abstract

Contrastive learning typically matches pairs of related views among a number of unrelated negative views. Views can be generated (e.g. by augmentations) or be observed. We investigate matching when there are more than two related views, which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics. We show that with unlimited computation, one should maximize the number of related views, and that with a fixed compute budget, it is beneficial to decrease the number of unique samples whilst increasing the number of views of those samples. In particular, poly-view contrastive models trained for 128 epochs with batch size 256 outperform SimCLR trained for 1024 epochs at batch size 4096 on ImageNet1k, challenging the belief that contrastive models require large batch sizes and many training epochs.

Keywords

contrastive learning; classification; machine learning

Cite

@article{arxiv.2403.05490,
  title  = {Poly-View Contrastive Learning},
  author = {Amitis Shidani and Devon Hjelm and Jason Ramapuram and Russ Webb and Eeshan Gunesh Dhekane and Dan Busbridge},
  journal= {arXiv preprint arXiv:2403.05490},
  year   = {2024}
}

Comments

Accepted to ICLR 2024. 42 pages, 7 figures, 3 tables, loss pseudo-code included in appendix
