Personalizing Pre-trained Models

Computer Vision and Pattern Recognition 2021-06-04 v1

Abstract

Self-supervised or weakly supervised models trained on large-scale datasets have shown sample-efficient transfer to diverse datasets in few-shot settings. We consider how upstream pre-trained models can be leveraged for downstream few-shot, multi-label, and continual learning tasks. Our model, CLIPPER (CLIP PERsonalized), builds on image representations from CLIP, a large-scale image representation learning model trained with weak natural language supervision. CLIPPER applies Multi-label Weight Imprinting (MWI), a technique we developed for multi-label, continual, and few-shot learning, to these representations. We evaluated CLIPPER on 10 single-label and 5 multi-label datasets. Our model shows robust and competitive performance, and we set new benchmarks for few-shot, multi-label, and continual learning. Our lightweight technique is also compute-efficient and enables privacy-preserving applications, because the data is not sent to the upstream model for fine-tuning.
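The abstract does not spell out how MWI works, so the following is only a minimal sketch of generic weight imprinting extended to multiple labels in the most direct way, not the authors' implementation. It assumes that each label's classifier weight is the normalized sum of the unit-length embeddings of the examples carrying that label, and that prediction thresholds cosine similarity. The function names, the 0.5 threshold, and the toy 2-D vectors (standing in for frozen CLIP image features) are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def imprint_weights(embeddings, label_sets, weights=None):
    """Add few-shot examples to per-label running sums of unit embeddings.

    embeddings: feature vectors (assumed to come from a frozen encoder).
    label_sets: one set of labels per embedding (multi-label).
    weights:    sums from earlier calls, so new labels or examples can be
                added later without revisiting old data (continual use).
    """
    weights = {} if weights is None else weights
    for emb, labels in zip(embeddings, label_sets):
        unit = l2_normalize(emb)
        for label in labels:
            current = weights.get(label, [0.0] * len(unit))
            weights[label] = [w + u for w, u in zip(current, unit)]
    return weights


def predict(embedding, weights, threshold=0.5):
    """Return labels whose cosine similarity to the query meets the threshold."""
    unit = l2_normalize(embedding)
    scores = {
        label: sum(a * b for a, b in zip(unit, l2_normalize(total)))
        for label, total in weights.items()
    }
    return {label for label, s in scores.items() if s >= threshold}, scores


# Toy usage: imprint three examples, then add a fourth in a later step.
weights = imprint_weights(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    [{"cat"}, {"cat", "indoor"}, {"dog"}],
)
weights = imprint_weights([[0.1, 0.9]], [{"dog", "outdoor"}], weights)
labels, scores = predict([1.0, 0.05], weights)
```

Because only the per-label sums are updated and the encoder is never fine-tuned, a scheme of this kind needs no gradient steps and keeps the few-shot data on the downstream side, which is consistent with the compute and privacy points made in the abstract.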

Cite

@article{arxiv.2106.01499,
  title  = {Personalizing Pre-trained Models},
  author = {Mina Khan and P Srivatsa and Advait Rane and Shriram Chenniappa and Asadali Hazariwala and Pattie Maes},
  journal= {arXiv preprint arXiv:2106.01499},
  year   = {2021}
}