Related papers: Workshop on Document Intelligence Understanding

200 papers

Document-based Visual Question Answering examines the document understanding of document images in conditions of natural language questions. We proposed a new document-based VQA dataset, PDF-VQA, to comprehensively examine the document…

Computer Vision and Pattern Recognition · Computer Science 2023-06-07 Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han

Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their…

Information Retrieval · Computer Science 2023-04-04 Rubèn Tito, Dimosthenis Karatzas, Ernest Valveny

Document AI, or Document Intelligence, is a relatively new research topic that refers to the techniques for automatically reading, understanding, and analyzing business documents. It is an important research direction for natural language…

Computation and Language · Computer Science 2021-11-17 Lei Cui, Yiheng Xu, Tengchao Lv, Furu Wei

Extracting key information from documents represents a large portion of business workloads and therefore offers a high potential for efficiency improvements and process automation. With recent advances in Deep Learning, a plethora of Deep…

Information Retrieval · Computer Science 2025-07-21 Alexander Michael Rombach, Peter Fettke

Understanding and extracting information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents. Such large documents may be…

Computation and Language · Computer Science 2019-10-10 Muhammad Mahbubur Rahman, Tim Finin

Since real-world ubiquitous documents (e.g., invoices, tickets, resumes and leaflets) contain rich information, automatic document image understanding has become a hot topic. Most existing works decouple the problem into two separate tasks,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-26 Peng Zhang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu

Understanding documents is central to many real-world tasks but remains a challenging topic. Unfortunately, there is no well-established consensus on how to comprehensively evaluate document understanding abilities, which significantly…

Computation and Language · Computer Science 2023-05-17 Ruoxi Xu, Hongyu Lin, Xinyan Guan, Xianpei Han, Yingfei Sun, Le Sun

Documents are a core part of many businesses in many fields such as law, finance, and technology among others. Automatic understanding of documents such as invoices, contracts, and resumes is lucrative, opening up many new avenues of…

Computation and Language · Computer Science 2021-02-08 Nishant Subramani, Alexandre Matton, Malcolm Greaves, Adrian Lam

Visual document understanding (VDU) is a challenging task that involves understanding documents across various modalities (text and image) and layouts (forms, tables, etc.). This study aims to enhance generalizability of small VDU models by…

Computer Vision and Pattern Recognition · Computer Science 2024-10-07 Sungnyun Kim, Haofu Liao, Srikar Appalaraju, Peng Tang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan, Stefano Soatto

Document-level information extraction (IE) is a crucial task in natural language processing (NLP). This paper conducts a systematic review of recent document-level IE literature. In addition, we conduct a thorough error analysis with…

Computation and Language · Computer Science 2023-09-26 Hanwen Zheng, Sijia Wang, Lifu Huang

In the field of machine learning, data understanding is the practice of getting initial insights in unknown datasets. Such knowledge-intensive tasks require a lot of documentation, which is necessary for data scientists to grasp the meaning…

Databases · Computer Science 2018-06-14 Markus Schröder, Christian Jilek, Jörn Hees, Andreas Dengel

Research in Document Intelligence, and especially in Document Key Information Extraction (DocKIE), has mainly been approached as a Token Classification problem. Recent breakthroughs in both natural language processing (NLP) and computer vision…

Computation and Language · Computer Science 2023-04-24 Laurent Lam, Pirashanth Ratnamogan, Joël Tang, William Vanhuffel, Fabien Caspani

Document Question Answering (QA) presents a challenge in understanding visually-rich documents (VRD), particularly those dominated by lengthy textual content like research journal articles. Existing studies primarily focus on real-world…

Computer Vision and Pattern Recognition · Computer Science 2024-04-22 Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, Soyeon Caren Han

This paper introduces a deep learning model tailored for document information analysis, emphasizing document classification, entity relation extraction, and document visual question answering. The proposed model leverages transformer-based…

Computer Vision and Pattern Recognition · Computer Science 2023-10-26 Tofik Ali, Partha Pratim Roy

We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets…

Computer Vision and Pattern Recognition · Computer Science 2021-01-06 Minesh Mathew, Dimosthenis Karatzas, C. V. Jawahar

Event extraction, the technology that aims to automatically get the structural information from documents, has attracted more and more attention in many fields. Most existing works discuss this issue with the token-level multi-label…

Computation and Language · Computer Science 2022-01-11 Zhuo Xu, Yue Wang, Lu Bai, Lixin Cui

Document parsing (DP) transforms unstructured or semi-structured documents into structured, machine-readable representations, enabling downstream applications such as knowledge base construction and retrieval-augmented generation (RAG).…

Procedures are an important knowledge component of documents that can be leveraged by cognitive assistants for automation, question-answering or driving a conversation. It is a challenging problem to parse big dense documents like product…

Artificial Intelligence · Computer Science 2020-10-21 Shivali Agarwal, Shubham Atreja, Vikas Agarwal

Document indexation is an essential task achieved by archivists or automatic indexing tools. To retrieve relevant documents to a query, keywords describing this document have to be carefully chosen. Archivists have to find out the right…

Information Retrieval · Computer Science 2009-12-09 Carlo Abi Chahine, Nathalie Chaignaud, Jean-Philippe Kotowicz, Jean-Pierre Pécuchet

Reading comprehension models are based on recurrent neural networks that sequentially process the document tokens. As interest turns to answering more complex questions over longer documents, sequential reading of large portions of text…

Computation and Language · Computer Science 2018-09-11 Mor Geva, Jonathan Berant