Related papers: Graph-based Document Structure Analysis

200 papers

Document layout analysis (DLA) is the task of detecting the distinct, semantic content within a document and correctly classifying these items into an appropriate category (e.g., text, title, figure). DLA pipelines enable users to convert…

Machine Learning · Computer Science · 2023-08-07 · Jilin Wang, Michael Krumdick, Baojia Tong, Hamima Halim, Maxim Sokolov, Vadym Barda, Delphine Vendryes, Chris Tanner

Document structure analysis (aka document layout analysis) is crucial for understanding the physical layout and logical structure of documents, with applications in information retrieval, document summarization, knowledge extraction, etc.…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-29 · Jiawei Wang, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo

The document layout analysis (DLA) aims to decompose document images into high-level semantic areas (i.e., figures, tables, texts, and background). Creating a DLA framework with strong generalization capabilities is a challenge due to…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-24 · Xingjiao Wu, Luwei Xiao, Xiangcheng Du, Yingbin Zheng, Xin Li, Tianlong Ma, Cheng Jin, Liang He

Document structure analysis, aka document layout analysis, is crucial for understanding both the physical layout and logical structure of documents, serving information retrieval, document summarization, knowledge extraction, etc.…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-27 · Jiawei Wang, Kai Hu, Qiang Huo

Recognizing the layout of unstructured digital documents is crucial when parsing the documents into the structured, machine-readable format for downstream applications. Recent studies in Document Layout Analysis usually rely on computer…

Computer Vision and Pattern Recognition · Computer Science · 2022-09-20 · Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, Soyeon Caren Han

The automatic analysis of document layouts in digital-born PDF documents remains a challenging problem due to the heterogeneous arrangement of textual and nontextual elements and the imprecision of the textual metadata in the Portable…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-29 · Miguel Lopez-Duran, Julian Fierrez, Aythami Morales, Ruben Tolosana, Oscar Delgado-Mohatar, Alvaro Ortigosa

Document layout analysis (DLA) is crucial for understanding the physical layout and logical structure of documents, serving information retrieval, document summarization, knowledge extraction, etc. However, previous studies have typically…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-21 · Jiawei Wang, Kai Hu, Qiang Huo

In recent years, the use of multi-modal pre-trained Transformers has led to significant advancements in visually-rich document understanding. However, existing models have mainly focused on features such as text and vision while neglecting…

Computation and Language · Computer Science · 2023-08-16 · Qiwei Li, Zuchao Li, Xiantao Cai, Bo Du, Hai Zhao

Every day, thousands of digital documents are generated with useful information for companies, public organizations, and citizens. Given the impossibility of processing them manually, the automatic processing of these documents is becoming…

Advances in Visually Rich Document Understanding (VrDU) have enabled information extraction and question answering over documents with complex layouts. Two tropes of architectures have emerged -- transformer-based models inspired by LLMs,…

Computation and Language · Computer Science · 2024-01-08 · Dongsheng Wang, Zhiqiang Ma, Armineh Nourbakhsh, Kang Gu, Sameena Shah

Document layout analysis (DLA) is the process by which a page is parsed into meaningful elements, often using machine learning models. Typically, the quality of a model is judged using general object detection metrics such as IoU, F1 or…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-17 · Jonathan Bourne, Mwiza Simbeye, Ishtar Govia

The problem of document structure reconstruction refers to converting digital or scanned documents into corresponding semantic structures. Most existing works mainly focus on splitting the boundary of each element in a single document page,…

Computation and Language · Computer Science · 2023-03-27 · Jiefeng Ma, Jun Du, Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Huihui Zhu, Cong Liu

Document layout analysis usually relies on computer vision models to understand documents while ignoring textual information that is vital to capture. Meanwhile, high quality labeled datasets with both visual and textual information are…

Computation and Language · Computer Science · 2020-11-12 · Minghao Li, Yiheng Xu, Lei Cui, Shaohan Huang, Furu Wei, Zhoujun Li, Ming Zhou

Document parsing (DP) transforms unstructured or semi-structured documents into structured, machine-readable representations, enabling downstream applications such as knowledge base construction and retrieval-augmented generation (RAG).…

Conventional document layout analysis (DLA) traditionally depends on empirical priors or a fixed set of learnable queries executed in a single forward pass. While sufficient for early-generation documents with a small, predetermined number…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-26 · Yufan Chen, Omar Moured, Ruiping Liu, Junwei Zheng, Kunyu Peng, Jiaming Zhang, Rainer Stiefelhagen

Document-based Visual Question Answering examines the document understanding of document images in conditions of natural language questions. We proposed a new document-based VQA dataset, PDF-VQA, to comprehensively examine the document…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-07 · Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han

Document intelligence as a relatively new research topic supports many business applications. Its main task is to automatically read, understand, and analyze documents. However, due to the diversity of formats (invoices, reports, forms,…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-25 · Zhenrong Zhang, Jiefeng Ma, Jun Du, Licheng Wang, Jianshu Zhang

Analyzing textual data is a very challenging task because of the huge volume of data generated daily. Fundamental issues in text analysis include the lack of structure in document datasets, the need for various preprocessing steps (e.g.,…

Databases · Computer Science · 2016-12-20 · Ciprian-Octavian Truică, Jérôme Darmont, Julien Velcin

Before developing a Document Layout Analysis (DLA) model in real-world applications, conducting comprehensive robustness testing is essential. However, the robustness of DLA models remains underexplored in the literature. To address this,…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-22 · Yufan Chen, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ruiping Liu, Philip Torr, Rainer Stiefelhagen

In document classification, graph-based models effectively capture document structure, overcoming sequence length limitations and enhancing contextual understanding. However, most existing graph document representations rely on heuristics,…

Computation and Language · Computer Science · 2025-08-05 · Margarita Bugueño, Gerard de Melo