SimLabel: Similarity-Weighted Iterative Framework for Multi-annotator Learning with Missing Annotations

Multimedia 2025-08-08 v2

Abstract

Multi-annotator learning (MAL) aims to model annotator-specific labeling patterns. However, existing methods face a critical challenge: they simply skip updating annotator-specific model parameters when a label is missing, a common scenario in real-world crowdsourced datasets, where each annotator labels only a small subset of the samples. This leads to inefficient data utilization and a risk of overfitting. To address this, we propose SimLabel, a novel similarity-weighted semi-supervised learning framework that leverages inter-annotator similarities to generate weighted soft labels for missing annotations, so that unannotated samples are used rather than skipped entirely. We further introduce a confidence-based iterative refinement mechanism that combines maximum probability with entropy-based uncertainty to prioritize high-quality predicted pseudo-labels for imputing missing labels, jointly improving similarity estimation and model performance over time. For evaluation, we contribute a new multimodal multi-annotator dataset, AMER2, with high and more variable missing rates, reflecting real-world annotation sparsity and enabling evaluation across different sparsity levels.
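The two ideas in the abstract, similarity-weighted soft labels and a confidence score that mixes maximum probability with entropy, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: similarity is taken as the agreement rate on co-annotated samples, the soft label is a similarity-weighted vote over the annotators who did label the sample, and confidence is the maximum probability scaled by one minus the normalized entropy. The function names and the `MISSING` marker are hypothetical and are not taken from the paper.

```python
import math

MISSING = None  # assumed marker for an absent annotation

def agreement_similarity(labels, a, b):
    """Fraction of co-annotated samples on which annotators a and b agree."""
    shared = [(row[a], row[b]) for row in labels
              if row[a] is not MISSING and row[b] is not MISSING]
    if not shared:
        return 0.0  # no overlap, so no evidence of similarity
    return sum(x == y for x, y in shared) / len(shared)

def soft_label(labels, sample, target, num_classes):
    """Similarity-weighted class distribution for target's missing label."""
    dist = [0.0] * num_classes
    for other, y in enumerate(labels[sample]):
        if other == target or y is MISSING:
            continue
        # Each available label votes with weight equal to the similarity
        # between its annotator and the target annotator.
        dist[y] += agreement_similarity(labels, target, other)
    total = sum(dist)
    if total == 0.0:
        return [1.0 / num_classes] * num_classes  # fall back to uniform
    return [w / total for w in dist]

def confidence(probs):
    """Max probability scaled by (1 - normalized entropy); assumed form."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return max(probs) * (1.0 - entropy / math.log(len(probs)))

# Rows are samples, columns are annotators; annotator 0 skipped sample 3.
labels = [
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [MISSING, 1, 0],
]
imputed = soft_label(labels, sample=3, target=0, num_classes=2)
score = confidence(imputed)
```

In this toy matrix, annotator 1 agrees with annotator 0 more often than annotator 2 does, so annotator 1's label receives the larger weight in `imputed`. In an iterative scheme of the kind the abstract describes, a score such as `score` could be thresholded to decide which pseudo-labels are written back before similarities are re-estimated.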

Cite

@article{arxiv.2504.09525,
  title  = {SimLabel: Similarity-Weighted Iterative Framework for Multi-annotator Learning with Missing Annotations},
  author = {Liyun Zhang and Zheng Lian and Hong Liu and Takanori Takebe and Yuta Nakashima},
  journal= {arXiv preprint arXiv:2504.09525},
  year   = {2025}
}

Comments

9 pages