Future Segmentation Using 3D Structure

Computer Vision and Pattern Recognition 2018-11-29 v1

Abstract

Predicting the future to anticipate the outcome of events and actions is a critical attribute of autonomous agents, particularly for agents that must rely heavily on real-time visual data for decision making. Working towards this capability, we address the task of predicting future frame segmentation from a stream of monocular video by leveraging the 3D structure of the scene. Our framework is based on learnable sub-modules capable of predicting pixel-wise scene semantic labels, depth, and camera ego-motion of adjacent frames. We further propose a model, based on a recurrent neural network, that predicts the future ego-motion trajectory as a function of a series of past ego-motion steps. Ultimately, we observe that leveraging 3D structure in the model facilitates successful prediction, achieving state-of-the-art accuracy in future semantic segmentation.
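To make the idea of leveraging 3D structure concrete, the following is a minimal sketch of one plausible way to combine the three predicted quantities: back-project each labeled pixel using its depth, move it by a predicted ego-motion transform, and re-project it into the future frame. This is an illustrative assumption, not necessarily the authors' implementation. The function name `warp_segmentation`, the pinhole intrinsics tuple `(fx, fy, cx, cy)`, the 4x4 `transform` convention (current camera frame to future camera frame), and the `hole` marker are all hypothetical.

```python
import math


def warp_segmentation(seg, depth, intrinsics, transform, hole=-1):
    """Forward-warp a label map into a future camera frame (illustrative sketch).

    seg        : H x W nested list of integer class labels (current frame)
    depth      : H x W nested list of positive depths (current frame)
    intrinsics : (fx, fy, cx, cy) of an assumed pinhole camera
    transform  : 4 x 4 rigid transform mapping points from the current
                 camera frame to the future camera frame (assumed convention)
    hole       : label written where no source pixel lands
    """
    fx, fy, cx, cy = intrinsics
    h, w = len(seg), len(seg[0])
    warped = [[hole] * w for _ in range(h)]
    zbuf = [[math.inf] * w for _ in range(h)]  # nearest-surface test

    for v in range(h):
        for u in range(w):
            d = depth[v][u]
            # Back-project pixel (u, v) to a homogeneous 3D point.
            p = ((u - cx) / fx * d, (v - cy) / fy * d, d, 1.0)
            # Apply the ego-motion transform (first three rows only).
            x, y, z = (sum(row[i] * p[i] for i in range(4))
                       for row in transform[:3])
            if z <= 0:
                continue  # point ends up behind the future camera
            # Project into the future image and round to the nearest pixel.
            u2 = math.floor(fx * x / z + cx + 0.5)
            v2 = math.floor(fy * y / z + cy + 0.5)
            if 0 <= u2 < w and 0 <= v2 < h and z < zbuf[v2][u2]:
                zbuf[v2][u2] = z
                warped[v2][u2] = seg[v][u]
    return warped
```

In this sketch, pixels that no source point reaches keep the `hole` label. A complete system would need some strategy for filling such regions, and the sketch treats the scene as static, so independently moving objects are not handled.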

Cite

@article{arxiv.1811.11358,
  title   = {Future Segmentation Using 3D Structure},
  author  = {Suhani Vora and Reza Mahjourian and Soeren Pirk and Anelia Angelova},
  journal = {arXiv preprint arXiv:1811.11358},
  year    = {2018}
}