Related papers: Geometric Representation Learning for Document Ima…

ForCenNet: Foreground-Centric Network for Document Image Rectification

Document image rectification aims to eliminate geometric deformation in photographed documents to facilitate text recognition. However, existing methods often neglect the significance of foreground elements, which provide essential…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-29 · Peng Cai, Qiang Li, Kaicheng Yang, Dong Guo, Jia Li, Nan Zhou, Xiang An, Ninghua Yang, Jiankang Deng

Geometric Rectification of Creased Document Images based on Isometric Mapping

Geometric rectification of images of distorted documents finds wide applications in document digitization and Optical Character Recognition (OCR). Although smoothly curved deformations have been widely investigated by many works, the most…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-19 · Dong Luo, Pengbo Bo

DocScanner: Robust Document Image Rectification with Progressive Learning

Compared with flatbed scanners, portable smartphones provide more convenience for physical document digitization. However, such digitized documents are often distorted due to uncontrolled physical deformations, camera positions, and…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-27 · Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li

BookNet: Book Image Rectification via Cross-Page Attention Network

Book image rectification presents unique challenges in document image processing due to complex geometric distortions from binding constraints, where left and right pages exhibit distinctly asymmetric curvature patterns. However, existing…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-30 · Shaokai Liu, Hao Feng, Bozhi Luan, Min Hou, Jiajun Deng, Wengang Zhou

GDCNet: Calibrationless geometric distortion correction of echo planar imaging data using deep learning

Functional magnetic resonance imaging techniques benefit from echo-planar imaging's fast image acquisition but are susceptible to inhomogeneities in the main magnetic field, resulting in geometric distortion and signal loss artifacts in the…

Image and Video Processing · Electrical Eng. & Systems · 2024-03-01 · Marina Manso Jimeno, Keren Bachi, George Gardner, Yasmin L. Hurd, John Thomas Vaughan, Sairam Geethanath

Self-Supervised Image Representation Learning with Geometric Set Consistency

We propose a method for self-supervised image representation learning under the guidance of 3D geometric consistency. Our intuition is that 3D geometric consistency priors such as smooth regions and surface discontinuities may imply…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-30 · Nenglun Chen, Lei Chu, Hao Pan, Yan Lu, Wenping Wang

Efficient Document Image Dewarping via Hybrid Deep Learning and Cubic Polynomial Geometry Restoration

Camera-captured document images often suffer from geometric distortions caused by paper deformation, perspective distortion, and lens aberrations, significantly reducing OCR accuracy. This study develops an efficient automated method for…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-20 · Valery Istomin, Oleg Pereziabov, Ilya Afanasyev

A Survey on Deep Geometry Learning: From a Representation Perspective

Researchers have now achieved great success on dealing with 2D images using deep learning. In recent years, 3D computer vision and Geometry Deep Learning gain more and more attention. Many advanced techniques for 3D shapes have been…

Graphics · Computer Science · 2020-04-16 · Yun-Peng Xiao, Yu-Kun Lai, Fang-Lue Zhang, Chunpeng Li, Lin Gao

D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping

Document image dewarping remains a challenging task in the deep learning era. While existing methods have improved by leveraging text line awareness, they typically focus only on a single horizontal dimension. In this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-05 · Heng Li, Xiangping Wu, Qingcai Chen

DocMAE: Document Image Rectification via Self-supervised Representation Learning

Tremendous efforts have been made on document image rectification, but how to learn effective representation of such distorted images is still under-explored. In this paper, we present DocMAE, a novel self-supervised framework for document…

Computer Vision and Pattern Recognition · Computer Science · 2023-04-21 · Shaokai Liu, Hao Feng, Wengang Zhou, Houqiang Li, Cong Liu, Feng Wu

Deep Unrestricted Document Image Rectification

In recent years, tremendous efforts have been made on document image rectification, but existing advanced algorithms are limited to processing restricted document images, i.e., the input images must incorporate a complete document. Once the…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-19 · Hao Feng, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li

DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction

In this work, we propose a new framework, called Document Image Transformer (DocTr), to address the issue of geometry and illumination distortion of the document images. Specifically, DocTr consists of a geometric unwarping transformer and…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-11 · Hao Feng, Yuechen Wang, Wengang Zhou, Jiajun Deng, Houqiang Li

3D-GMNet: Single-View 3D Shape Recovery as A Gaussian Mixture

In this paper, we introduce 3D-GMNet, a deep neural network for 3D object shape reconstruction from a single image. As the name suggests, 3D-GMNet recovers 3D shape as a Gaussian mixture. In contrast to voxels, point clouds, or meshes, a…

Computer Vision and Pattern Recognition · Computer Science · 2020-08-18 · Kohei Yamashita, Shohei Nobuhara, Ko Nishino

Fourier Document Restoration for Robust Document Dewarping and Recognition

State-of-the-art document dewarping techniques learn to predict 3-dimensional information of documents which are prone to errors while dealing with documents with irregular distortions or large variations in depth. This paper presents…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-21 · Chuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai

PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph

Ubiquitous geometric objects can be precisely and efficiently represented as polyhedra. The transformation of a polyhedron into a vector, known as polyhedra representation learning, is crucial for manipulating these shapes with mathematical…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-20 · Dazhou Yu, Genpei Zhang, Liang Zhao

Blind Geometric Distortion Correction on Images Through Deep Learning

We propose the first general framework to automatically correct different types of geometric distortion in a single input image. Our proposed method employs convolutional neural networks (CNNs) trained by using a large synthetic distortion…

Computer Vision and Pattern Recognition · Computer Science · 2019-09-10 · Xiaoyu Li, Bo Zhang, Pedro V. Sander, Jing Liao

Self-Attention Based Multi-Scale Graph Auto-Encoder Network of 3D Meshes

3D meshes are fundamental data representations for capturing complex geometric shapes in computer vision and graphics applications. While Convolutional Neural Networks (CNNs) have excelled in structured data like images, extending them to…

Graphics · Computer Science · 2025-07-09 · Saqib Nazir, Olivier Lézoray, Sébastien Bougleux

GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding

This paper presents GeoContrastNet, a language-agnostic framework to structured document understanding (DU) by integrating a contrastive learning objective with graph attention networks (GATs), emphasizing the significant role of geometric…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-07 · Nil Biescas, Carlos Boned, Josep Lladós, Sanket Biswas

Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling

Videos inherently represent 2D projections of a dynamic 3D world. However, our analysis suggests that video diffusion models trained solely on raw video data often fail to capture meaningful geometric-aware structure in their learned…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-06 · Haoyu Wu, Diankun Wu, Tianyu He, Junliang Guo, Yang Ye, Yueqi Duan, Jiang Bian

Deep Learned Full-3D Object Completion from Single View

3D geometry is a very informative cue when interacting with and navigating an environment. This writing proposes a new approach to 3D reconstruction and scene understanding, which implicitly learns 3D geometry from depth maps pairing a deep…

Computer Vision and Pattern Recognition · Computer Science · 2018-08-22 · Dario Rethage, Federico Tombari, Felix Achilles, Nassir Navab