Playing for Benchmarks

Stephan R. Richter; Zeeshan Hayder; Vladlen Koltun

Computer Vision and Pattern Recognition 2017-09-22 v1

Authors: Stephan R. Richter, Zeeshan Hayder, Vladlen Koltun

Abstract

We present a benchmark suite for visual perception. The benchmark is based on more than 250K high-resolution video frames, all annotated with ground-truth data for both low-level and high-level vision tasks, including optical flow, semantic instance segmentation, object detection and tracking, object-level 3D scene layout, and visual odometry. Ground-truth data for all tasks is available for every frame. The data was collected while driving, riding, and walking a total of 184 kilometers in diverse ambient conditions in a realistic virtual world. To create the benchmark, we have developed a new approach to collecting ground-truth data from simulated worlds without access to their source code or content. We conduct statistical analyses that show that the composition of the scenes in the benchmark closely matches the composition of corresponding physical environments. The realism of the collected data is further validated via perceptual experiments. We analyze the performance of state-of-the-art methods for multiple tasks, providing reference baselines and highlighting challenges for future research. The supplementary video can be viewed at https://youtu.be/T9OybWv923Y

Keywords

virtual reality, video understanding, sports analytics

Cite

@article{arxiv.1709.07322,
  title  = {Playing for Benchmarks},
  author = {Stephan R. Richter and Zeeshan Hayder and Vladlen Koltun},
  journal= {arXiv preprint arXiv:1709.07322},
  year   = {2017}
}

Comments

Published at the International Conference on Computer Vision (ICCV 2017)
