Related papers: Cross-Domain Document Layout Analysis Using Docume…

A Graphical Approach to Document Layout Analysis

Document layout analysis (DLA) is the task of detecting the distinct, semantic content within a document and correctly classifying these items into an appropriate category (e.g., text, title, figure). DLA pipelines enable users to convert…

Machine Learning · Computer Science 2023-08-07 Jilin Wang, Michael Krumdick, Baojia Tong, Hamima Halim, Maxim Sokolov, Vadym Barda, Delphine Vendryes, Chris Tanner

Graph-based Document Structure Analysis

When reading a document, glancing at the spatial layout of a document is an initial step to understand it roughly. Traditional document layout analysis (DLA) methods, however, offer only a superficial parsing of documents, focusing on basic…

Computer Vision and Pattern Recognition · Computer Science 2025-02-05 Yufan Chen, Ruiping Liu, Junwei Zheng, Di Wen, Kunyu Peng, Jiaming Zhang, Rainer Stiefelhagen

UnSupDLA: Towards Unsupervised Document Layout Analysis

Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the…

Computer Vision and Pattern Recognition · Computer Science 2024-06-11 Talha Uddin Sheikh, Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

SFDLA: Source-Free Document Layout Analysis

Document Layout Analysis (DLA) is a fundamental task in document understanding. However, existing DLA and adaptation methods often require access to large-scale source data and target labels. These requirements severely limit their…

Computer Vision and Pattern Recognition · Computer Science 2025-06-19 Sebastian Tewes, Yufan Chen, Omar Moured, Jiaming Zhang, Rainer Stiefelhagen

Document Layout Analysis via Dynamic Residual Feature Fusion

Document layout analysis (DLA) aims to split the document image into different regions of interest and understand the role of each region, with wide applications such as optical character recognition (OCR) systems and document…

Computer Vision and Pattern Recognition · Computer Science 2022-02-15 Xingjiao Wu, Ziling Hu, Xiangcheng Du, Jing Yang, Liang He

Document Layout Analysis with Aesthetic-Guided Image Augmentation

Document layout analysis (DLA) plays an important role in information extraction and document understanding. At present, document layout analysis has reached a milestone; however, document layout analysis of non-Manhattan is…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Tianlong Ma, Xingjiao Wu, Xin Li, Xiangcheng Du, Zhao Zhou, Liang Xue, Cheng Jin

The COTe score: A decomposable framework for evaluating Document Layout Analysis models

Document layout analysis (DLA) is the process by which a page is parsed into meaningful elements, often using machine learning models. Typically, the quality of a model is judged using general object detection metrics such as IoU, F1 or…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Jonathan Bourne, Mwiza Simbeye, Ishtar Govia

HybriDLA: Hybrid Generation for Document Layout Analysis

Conventional document layout analysis (DLA) depends on empirical priors or a fixed set of learnable queries executed in a single forward pass. While sufficient for early-generation documents with a small, predetermined number…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Yufan Chen, Omar Moured, Ruiping Liu, Junwei Zheng, Kunyu Peng, Jiaming Zhang, Rainer Stiefelhagen

Cross-Domain Document Object Detection: Benchmark Suite and Method

Decomposing images of document pages into high-level semantic regions (e.g., figures, tables, paragraphs), document object detection (DOD) is fundamental for downstream tasks like intelligent document editing and understanding. DOD remains…

Computer Vision and Pattern Recognition · Computer Science 2020-03-31 Kai Li, Curtis Wigington, Chris Tensmeyer, Handong Zhao, Nikolaos Barmpalios, Vlad I. Morariu, Varun Manjunatha, Tong Sun, Yun Fu

A Hybrid Approach for Document Layout Analysis in Document images

Document layout analysis involves understanding the arrangement of elements within a document. This paper navigates the complexities of understanding various elements within document images, such as text, images, tables, and headings. The…

Computer Vision and Pattern Recognition · Computer Science 2024-05-02 Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

Human-In-The-Loop Document Layout Analysis

Document layout analysis (DLA) aims to divide a document image into different types of regions. DLA plays an important role in document content understanding and information extraction systems. Exploring a method that can use less data…

Computer Vision and Pattern Recognition · Computer Science 2021-08-05 Xingjiao Wu, Tianlong Ma, Xin Li, Qin Chen, Liang He

Cross-Domain Labeled LDA for Cross-Domain Text Classification

Cross-domain text classification aims at building a classifier for a target domain which leverages data from both the source and target domains. One promising idea is to minimize the feature distribution differences of the two domains. Most…

Computation and Language · Computer Science 2019-01-07 Baoyu Jing, Chenwei Lu, Deqing Wang, Fuzhen Zhuang, Cheng Niu

DLAFormer: An End-to-End Transformer For Document Layout Analysis

Document layout analysis (DLA) is crucial for understanding the physical layout and logical structure of documents, serving information retrieval, document summarization, knowledge extraction, etc. However, previous studies have typically…

Computer Vision and Pattern Recognition · Computer Science 2024-05-21 Jiawei Wang, Kai Hu, Qiang Huo

OmniDocLayout: Towards Diverse Document Layout Generation via Coarse-to-Fine LLM Learning

Document AI has advanced rapidly and is attracting increasing attention. Yet, while most efforts have focused on document layout analysis (DLA), its generative counterpart, layout generation, remains underexplored. Distinct from traditional…

Computer Vision and Pattern Recognition · Computer Science 2025-11-25 Hengrui Kang, Zhuangcheng Gu, Zhiyuan Zhao, Zichen Wen, Bin Wang, Weijia Li, Conghui He

VTLayout: Fusion of Visual and Text Features for Document Layout Analysis

Documents often contain complex physical structures, which make the Document Layout Analysis (DLA) task challenging. As a pre-processing step for content extraction, DLA has the potential to capture rich information in historical or…

Information Retrieval · Computer Science 2021-08-31 Shoubin Li, Xuyan Ma, Shuaiqun Pan, Jun Hu, Lin Shi, Qing Wang

Vision Grid Transformer for Document Layout Analysis

Document pre-trained models and grid-based models have proven to be very effective on various tasks in Document AI. However, for the document layout analysis (DLA) task, existing document pre-trained models, even those pre-trained in a…

Computer Vision and Pattern Recognition · Computer Science 2023-08-30 Cheng Da, Chuwei Luo, Qi Zheng, Cong Yao

LED: A Benchmark for Evaluating Layout Error Detection in Document Analysis

Recent advances in Large Language Models (LLMs) and Large Multimodal Models (LMMs) have improved Document Layout Analysis (DLA), yet structural errors such as region merging, splitting, and omission remain persistent. Conventional…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Inbum Heo, Taewook Hwang, Jeesu Jung, Sangkeun Jung

Cross-domain Contrastive Learning for Unsupervised Domain Adaptation

Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain. Most existing UDA methods learn domain-invariant feature representations by minimizing…

Computer Vision and Pattern Recognition · Computer Science 2022-05-10 Rui Wang, Zuxuan Wu, Zejia Weng, Jingjing Chen, Guo-Jun Qi, Yu-Gang Jiang

Improved Techniques for Adversarial Discriminative Domain Adaptation

Adversarial discriminative domain adaptation (ADDA) is an efficient framework for unsupervised domain adaptation in image classification, where the source and target domains are assumed to have the same classes, but no labels are available…

Computer Vision and Pattern Recognition · Computer Science 2019-11-12 Aaron Chadha, Yiannis Andreopoulos

Document Layout Annotation: Database and Benchmark in the Domain of Public Affairs

Every day, thousands of digital documents are generated with useful information for companies, public organizations, and citizens. Given the impossibility of processing them manually, the automatic processing of these documents is becoming…

Information Retrieval · Computer Science 2023-09-06 Alejandro Peña, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia, Marcos Grande, Iñigo Puente, Jorge Cordova, Gonzalo Cordova