Weakly Supervised Attended Object Detection Using Gaze Data as Annotations

Computer Vision and Pattern Recognition 2022-04-15 v1

Abstract

We consider the problem of detecting and recognizing the objects observed by visitors (i.e., attended objects) in cultural sites from egocentric vision. A standard approach to the problem involves detecting all objects and selecting the one that best overlaps with the visitor's gaze, as measured by a gaze tracker. Since labeling large amounts of data to train a standard object detector is costly and time-consuming, we propose a weakly supervised version of the task that relies only on gaze data and a frame-level label indicating the class of the attended object. To study the problem, we present a new dataset composed of egocentric videos and gaze coordinates of subjects visiting a museum. We then compare three different baselines for weakly supervised attended object detection on the collected data. Results show that the considered approaches achieve satisfactory performance in a weakly supervised manner, which allows for significant time savings with respect to a fully supervised detector based on Faster R-CNN. To encourage research on the topic, we publicly release the code and the dataset at the following URL: https://iplab.dmi.unict.it/WS_OBJ_DET/
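As an illustration of the standard approach mentioned in the abstract (detect all objects, then pick the one matching the gaze), the following is a minimal sketch and not the authors' implementation. The `Detection` type and the `select_attended` function are hypothetical names introduced here. The sketch assumes that "best overlap" means the box containing the gaze point whose center is closest to it; the abstract does not specify the exact criterion.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    """Hypothetical container for one detector output."""
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    score: float


def select_attended(detections: List[Detection],
                    gaze: Tuple[float, float]) -> Optional[Detection]:
    """Return the detection whose box contains the gaze point.

    Assumption: if several boxes contain the gaze point, the one whose
    center is closest to the gaze is chosen. Returns None when no box
    contains the gaze point.
    """
    gx, gy = gaze
    best: Optional[Detection] = None
    best_dist = float("inf")
    for det in detections:
        x0, y0, x1, y1 = det.box
        if x0 <= gx <= x1 and y0 <= gy <= y1:
            cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
            dist = (gx - cx) ** 2 + (gy - cy) ** 2  # squared distance to box center
            if dist < best_dist:
                best, best_dist = det, dist
    return best


# Usage with made-up boxes and gaze coordinates
detections = [
    Detection("statue", (0, 0, 100, 100), 0.9),
    Detection("painting", (50, 50, 300, 300), 0.8),
]
attended = select_attended(detections, gaze=(200, 200))
```

In this pipeline the detector needs bounding-box annotations for training, which is the labeling cost the weakly supervised formulation aims to avoid.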

Cite

@article{arxiv.2204.07090,
  title  = {Weakly Supervised Attended Object Detection Using Gaze Data as Annotations},
  author = {Michele Mazzamuto and Francesco Ragusa and Antonino Furnari and Giovanni Signorello and Giovanni Maria Farinella},
  journal= {arXiv preprint arXiv:2204.07090},
  year   = {2022}
}