Related papers: Table understanding in structured documents

200 papers

The automated analysis of administrative documents is an important field in document recognition that has been studied for decades. Invoices are key documents among the huge volume of documents available in companies and public services.…

Information Retrieval · Computer Science 2022-10-11 Thomas Saout, Frédéric Lardeux, Frédéric Saubion

Table structure recognition is necessary for a comprehensive understanding of documents. Tables in unstructured business documents are tough to parse due to the high diversity of layouts, varying alignments of contents, and the presence of…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Sachin Raja, Ajoy Mondal, C V Jawahar

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig

Document structure analysis, such as zone segmentation and table recognition, is a complex problem in document processing and is an active area of research. The recent success of deep learning in solving various computer vision and machine…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Shah Rukh Qasim, Hassan Mahmood, Faisal Shafait

The first phase of table recognition is to detect the tabular area in a document. Subsequently, the tabular structures are recognized in the second phase in order to extract information from the respective cells. Table detection and…

Computer Vision and Pattern Recognition · Computer Science 2021-05-11 Khurram Azeem Hashmi, Marcus Liwicki, Didier Stricker, Muhammad Adnan Afzal, Muhammad Ahtsham Afzal, Muhammad Zeshan Afzal

Recent deep learning approaches in table detection achieved outstanding performance and proved to be effective in identifying document layouts. Currently, available table detection benchmarks have many limitations, including the lack of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Mrinal Haloi, Shashank Shekhar, Nikhil Fande, Siddhant Swaroop Dash, Sanjay G

Tables are widely used in several types of documents since they can bring important information in a structured way. In scientific papers, tables can sum up novel discoveries and summarize experimental results, making the research…

Computer Vision and Pattern Recognition · Computer Science 2023-02-21 Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Extracting information from unstructured text documents is a demanding task, since these documents can have a broad variety of different layouts and a non-trivial reading order, as is the case for multi-column documents or nested…

Artificial Intelligence · Computer Science 2022-02-08 Matthias Engelbach, Dennis Klau, Jens Drawehn, Maximilien Kintz

Automatic table detection in PDF documents has achieved great success, but tabular data extraction is still challenging due to the integrity and noise issues in detected table areas. Accurate data extraction is extremely crucial in…

Computation and Language · Computer Science 2022-05-24 Siwen Luo, Mengting Wu, Yiwen Gong, Wanying Zhou, Josiah Poon

Table extraction has long been a pervasive problem in financial services. This is more challenging in the image domain, where content is locked behind a cumbersome pixel format. Luckily, advances in deep learning for image segmentation, OCR,…

Computer Vision and Pattern Recognition · Computer Science 2024-05-10 William Watson, Bo Liu

Table extraction from PDF and image documents is a ubiquitous task in the real world. Perfect extraction quality is difficult to achieve with one single out-of-box model due to (1) the wide variety of table styles, (2) the lack of training…

Human-Computer Interaction · Computer Science 2021-02-18 Nancy Xin Ru Wang, Douglas Burdick, Yunyao Li

Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces…

Information Retrieval · Computer Science 2020-10-20 Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including…

Computer Vision and Pattern Recognition · Computer Science 2021-12-03 Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

Tables are everywhere, from scientific journals, papers, websites, and newspapers all the way to items we buy at the supermarket. Detecting them is thus of utmost importance to automatically understanding the content of a document. The…

Computer Vision and Pattern Recognition · Computer Science 2022-11-17 Mahmoud Kasem, Abdelrahman Abdallah, Alexander Berendeyev, Ebrahem Elkady, Mahmoud Abdalla, Mohamed Mahmoud, Mohamed Hamada, Daniyar Nurseitov, Islam Taj-Eddin

This paper presents the design and development of an OCR-powered pipeline for efficient table extraction from invoices. The system leverages Tesseract OCR for text recognition and custom post-processing logic to detect, align, and extract…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Parshva Dhilankumar Patel

Tables present summarized and structured information to the reader, which makes table structure extraction an important part of document understanding applications. However, table structure identification is a hard problem not only because…

Computer Vision and Pattern Recognition · Computer Science 2020-02-07 Saqib Ali Khan, Syed Muhammad Daniyal Khalid, Muhammad Ali Shahzad, Faisal Shafait

Workbook-scale spreadsheet understanding is increasingly important for language-model-based data analysis agents, but remains challenging because relevant information is often distributed across multiple sheets with heterogeneous schemas,…

Artificial Intelligence · Computer Science 2026-05-08 Yiming Lei, Yiqi Wang, Yujia Zhang, Bo Guan, Depei Zhu, Chunhui Wang, Zhuonan Hao, Tianyu Shi

Since real-world ubiquitous documents (e.g., invoices, tickets, resumes and leaflets) contain rich information, automatic document image understanding has become a hot topic. Most existing works decouple the problem into two separate tasks,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-26 Peng Zhang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu

Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, Knowledge Graphs, etc., since it enhances their predictive capabilities. Unfortunately, tables…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar

The extraction and use of diverse knowledge from numerous documents is a pressing challenge in intelligent information retrieval. Documents contain elements that require different recognition methods. Table recognition typically consists of…

Computer Vision and Pattern Recognition · Computer Science 2025-12-25 Takaya Kawakatsu