Deformable 3D Convolution for Video Super-Resolution

Xinyi Ying; Longguang Wang; Yingqian Wang; Weidong Sheng; Wei An; Yulan Guo

doi:10.1109/LSP.2020.3013518

Computer Vision and Pattern Recognition, 2021-11-29 (v5)

Authors: Xinyi Ying, Longguang Wang, Yingqian Wang, Weidong Sheng, Wei An, Yulan Guo

Abstract

The spatio-temporal information among video sequences is significant for video super-resolution (SR). However, the spatio-temporal information cannot be fully used by existing video SR methods since spatial feature extraction and temporal motion compensation are usually performed sequentially. In this paper, we propose a deformable 3D convolution network (D3Dnet) to incorporate spatio-temporal information from both spatial and temporal dimensions for video SR. Specifically, we introduce deformable 3D convolution (D3D) to integrate deformable convolution with 3D convolution, obtaining both superior spatio-temporal modeling capability and motion-aware modeling flexibility. Extensive experiments have demonstrated the effectiveness of D3D in exploiting spatio-temporal information. Comparative results show that our network achieves state-of-the-art SR performance. Code is available at: https://github.com/XinyiYing/D3Dnet.

Keywords

action recognition, image super-resolution, video generation

Cite

@article{arxiv.2004.02803,
  title  = {Deformable 3D Convolution for Video Super-Resolution},
  author = {Xinyi Ying and Longguang Wang and Yingqian Wang and Weidong Sheng and Wei An and Yulan Guo},
  journal= {arXiv preprint arXiv:2004.02803},
  year   = {2021}
}

Comments

Accepted by IEEE Signal Processing Letters
