Computer Vision and Pattern Recognition · Computer Science
VRD-IU: Lessons from Visually Rich Document Intelligence and Understanding
Yihao Ding, Soyeon Caren Han, Yan Li, Josiah Poon
2025-06-03
Computation and Language · Computer Science
Deep Learning based Visually Rich Document Content Understanding: A Survey
Yihao Ding, Soyeon Caren Han, Jean Lee, Eduard Hovy
2025-06-23
Artificial Intelligence · Computer Science
RDU: A Region-based Approach to Form-style Document Understanding
Fengbin Zhu, Chao Wang, Wenqiang Lei, Ziyang Liu +1
2022-06-15
Computer Vision and Pattern Recognition · Computer Science
ReLayout: Towards Real-World Document Understanding via Layout-enhanced Pre-training
Zhouqiang Jiang, Bowen Wang, Junhao Chen, Yuta Nakashima
2024-10-17
Computer Vision and Pattern Recognition · Computer Science
A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends
Yihao Ding, Siwen Luo, Yue Dai, Yanbei Jiang +5
2026-04-22
Computation and Language · Computer Science
VRDU: A Benchmark for Visually-rich Document Understanding
Zilong Wang, Yichao Zhou, Wei Wei, Chen-Yu Lee +1
2023-09-19
Computation and Language · Computer Science
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding
Junlong Li, Yiheng Xu, Lei Cui, Furu Wei
2022-03-14
Computer Vision and Pattern Recognition · Computer Science
Including Keyword Position in Image-based Models for Act Segmentation of Historical Registers
Mélodie Boillet, Martin Maarand, Thierry Paquet, Christopher Kermorvant
2021-11-03
Human-Computer Interaction · Computer Science
DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading
Hao Wang, Qingxuan Wang, Yue Li, Changqing Wang +2
2023-10-24
Computer Vision and Pattern Recognition · Computer Science
GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal
2024-02-21
Computer Vision and Pattern Recognition · Computer Science
Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang
2020-03-17
Computer Vision and Pattern Recognition · Computer Science
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
Sungnyun Kim, Haofu Liao, Srikar Appalaraju, Peng Tang +5
2024-10-07
Computer Vision and Pattern Recognition · Computer Science
Augmented Reality Meets Computer Vision : Efficient Data Generation for Urban Driving Scenes
Hassan Abu Alhaija, Siva Karthik Mustikovela, Lars Mescheder, Andreas Geiger +1
2017-08-07
Information Retrieval · Computer Science
Document Expansion by Query Prediction
Rodrigo Nogueira, Wei Yang, Jimmy Lin, Kyunghyun Cho
2019-09-26
Computation and Language · Computer Science
Document Optimization for Black-Box Retrieval via Reinforcement Learning
Omri Uzan, Ron Polonsky, Douwe Kiela, Christopher Potts
2026-04-08
Computer Vision and Pattern Recognition · Computer Science
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding
Chuwei Luo, Guozhi Tang, Qi Zheng, Cong Yao +4
2025-06-19
Computer Vision and Pattern Recognition · Computer Science
MATrIX -- Modality-Aware Transformer for Information eXtraction
Thomas Delteil, Edouard Belval, Lei Chen, Luis Goncalves +1
2022-05-18
Computer Vision and Pattern Recognition · Computer Science
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding
Wenjin Wang, Zhengjie Huang, Bin Luo, Qianglong Chen +7
2022-09-20
Information Retrieval · Computer Science
Improving Document Image Understanding with Reinforcement Finetuning
Bao-Sinh Nguyen, Dung Tien Le, Hieu M. Vu, Tuan Anh D. Nguyen +2
2022-09-27
Computer Vision and Pattern Recognition · Computer Science
Dynamic Objects Segmentation for Visual Localization in Urban Environments
Guoxiang Zhou, Berta Bescos, Marcin Dymczyk, Mark Pfeiffer +2
2018-07-11