iMatching: Imperative Correspondence Learning
Abstract
Learning feature correspondence is a foundational task in computer vision and is essential for downstream applications such as visual odometry and 3D reconstruction. Despite recent progress in data-driven models, feature correspondence learning is still limited by the lack of accurate per-pixel correspondence labels. To overcome this difficulty, we introduce a new self-supervised scheme, imperative learning (IL), for training feature correspondence. It enables correspondence learning on arbitrary uninterrupted videos without any camera pose or depth labels. Specifically, we formulate correspondence learning as a bilevel optimization problem that takes the reprojection error from bundle adjustment as the supervisory signal for the model. To avoid large memory and computation overhead, we leverage the stationary point of bundle adjustment to back-propagate the implicit gradients efficiently. Through extensive experiments, we demonstrate superior performance on tasks including feature matching and pose estimation, obtaining an average accuracy gain of 30% over state-of-the-art matching models.
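The stationary-point idea in the abstract can be illustrated with a toy scalar problem. The sketch below is not the authors' implementation: it assumes the upper and lower levels share one objective, uses plain gradient descent as a stand-in for bundle adjustment, and all names (`loss`, `solve_lower`, `stationary_point_grad`, `finite_difference_grad`) are hypothetical. Here `theta` plays the role of the model parameter and `x` the role of the variable the inner solver estimates. Because the inner solution satisfies dL/dx = 0, the gradient with respect to `theta` reduces to the partial derivative at the solution, so the solver iterations never need to be unrolled or stored.

```python
def loss(x, theta, a, b):
    # Toy "reprojection error": squared residuals between the estimate x
    # and the model-dependent observations theta * a_i + b_i.
    return sum((x - (theta * ai + bi)) ** 2 for ai, bi in zip(a, b))


def solve_lower(theta, a, b, steps=200, lr=0.1):
    # Lower level: minimize the loss over x with theta fixed.
    # Plain gradient descent stands in for bundle adjustment (assumption).
    x = 0.0
    n = len(a)
    for _ in range(steps):
        grad_x = sum(2.0 * (x - (theta * ai + bi)) for ai, bi in zip(a, b))
        x -= lr * grad_x / n
    return x


def stationary_point_grad(theta, a, b):
    # At the stationary point dL/dx = 0, so the chain-rule term
    # (dL/dx) * (dx*/dtheta) vanishes and only the partial derivative
    # with respect to theta, evaluated at x*, remains.
    x_star = solve_lower(theta, a, b)
    return sum(-2.0 * ai * (x_star - (theta * ai + bi)) for ai, bi in zip(a, b))


def finite_difference_grad(theta, a, b, eps=1e-6):
    # Reference value: re-solve the lower level at perturbed theta and
    # difference the resulting optimal losses.
    up = loss(solve_lower(theta + eps, a, b), theta + eps, a, b)
    down = loss(solve_lower(theta - eps, a, b), theta - eps, a, b)
    return (up - down) / (2.0 * eps)


if __name__ == "__main__":
    a = [1.0, 2.0, 4.0]
    b = [0.5, -0.3, 0.1]
    print(stationary_point_grad(0.7, a, b))
    print(finite_difference_grad(0.7, a, b))
```

The two printed values should agree closely, which is the point of the shortcut: the gradient is obtained from a single evaluation at the converged solution rather than by differentiating through every solver step.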
Cite
@article{arxiv.2312.02141,
title = {iMatching: Imperative Correspondence Learning},
author = {Zitong Zhan and Dasong Gao and Yun-Jou Lin and Youjie Xia and Chen Wang},
journal = {arXiv preprint arXiv:2312.02141},
year = {2023}
}
Comments
This preprint corresponds to the accepted manuscript at the European Conference on Computer Vision (ECCV) 2024.