Related papers: GFTE: Graph-based Financial Table Extraction

Deep Structured Feature Networks for Table Detection and Tabular Data Extraction from Scanned Financial Document Images

Automatic table detection in PDF documents has achieved great success, but tabular data extraction is still challenging due to the integrity and noise issues in detected table areas. Accurate data extraction is extremely crucial in…

Computation and Language · Computer Science 2022-05-24 Siwen Luo, Mengting Wu, Yiwen Gong, Wanying Zhou, Josiah Poon

Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context

Documents are often used for knowledge sharing and preservation in business and science, within which are tables that capture most of the critical data. Unfortunately, most documents are stored and distributed as PDF or scanned images,…

Computer Vision and Pattern Recognition · Computer Science 2020-12-03 Xinyi Zheng, Doug Burdick, Lucian Popa, Xu Zhong, Nancy Xin Ru Wang

Financial Table Extraction in Image Documents

Table extraction has long been a pervasive problem in financial services. This is more challenging in the image domain, where content is locked behind cumbersome pixel format. Luckily, advances in deep learning for image segmentation, OCR,…

Computer Vision and Pattern Recognition · Computer Science 2024-05-10 William Watson, Bo Liu

SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction

Table extraction from document images is a challenging AI problem, and labelled data for many content domains is difficult to come by. Existing table extraction datasets often focus on scientific tables due to the vast amount of academic…

Machine Learning · Computer Science 2024-12-06 Ethan Bradley, Muhammad Roman, Karen Rafferty, Barry Devereux

Flexible Table Recognition and Semantic Interpretation System

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including…

Computer Vision and Pattern Recognition · Computer Science 2021-12-03 Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

Rethinking Table Recognition using Graph Neural Networks

Document structure analysis, such as zone segmentation and table recognition, is a complex problem in document processing and is an active area of research. The recent success of deep learning in solving various computer vision and machine…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Shah Rukh Qasim, Hassan Mahmood, Faisal Shafait

Image-based table recognition: data, model, and evaluation

Important information that relates to a specific topic in a document is often organized in tabular format to assist readers with information retrieval and comparison, which may be difficult to provide in natural language. However, tabular…

Computer Vision and Pattern Recognition · Computer Science 2020-03-05 Xu Zhong, Elaheh ShafieiBavani, Antonio Jimeno Yepes

TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig

Interpretable Graph Neural Networks for Tabular Data

Data in tabular format is frequently occurring in real-world applications. Graph Neural Networks (GNNs) have recently been extended to effectively handle such data, allowing feature interactions to be captured through representation…

Machine Learning · Computer Science 2024-08-14 Amr Alkhatib, Sofiane Ennadir, Henrik Boström, Michalis Vazirgiannis

Complicated Table Structure Recognition

The task of table structure recognition aims to recognize the internal structure of a table, which is a key step to make machines understand tables. Currently, there are lots of studies on this task for different file formats such as ASCII…

Information Retrieval · Computer Science 2019-08-29 Zewen Chi, Heyan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao

TableLab: An Interactive Table Extraction System with Adaptive Deep Learning

Table extraction from PDF and image documents is a ubiquitous task in the real world. Perfect extraction quality is difficult to achieve with one single out-of-box model due to (1) the wide variety of table styles, (2) the lack of training…

Human-Computer Interaction · Computer Science 2021-02-18 Nancy Xin Ru Wang, Douglas Burdick, Yunyao Li

Extracting Tables from Documents using Conditional Generative Adversarial Networks and Genetic Algorithms

Extracting information from tables in documents presents a significant challenge in many industries and in academic research. Existing methods which take a bottom-up approach of integrating lines into cells and rows or columns neglect the…

Neural and Evolutionary Computing · Computer Science 2019-04-04 Nataliya Le Vine, Matthew Zeigenfuse, Mark Rowan

TabGSL: Graph Structure Learning for Tabular Data Prediction

This work presents a novel approach to tabular data prediction leveraging graph structure learning and graph neural networks. Despite the prevalence of tabular data in real-world applications, traditional deep learning methods often…

Machine Learning · Computer Science 2023-05-26 Jay Chiehen Liao, Cheng-Te Li

CHARTER: heatmap-based multi-type chart data extraction

The digital conversion of information stored in documents is a great source of knowledge. In contrast to the document's text, the conversion of the embedded document graphics, such as charts and plots, has been much less explored. We…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Joseph Shtok, Sivan Harary, Ophir Azulai, Adi Raz Goldfarb, Assaf Arbelle, Leonid Karlinsky

Graph Neural Networks and Representation Embedding for Table Extraction in PDF Documents

Tables are widely used in several types of documents since they can bring important information in a structured way. In scientific papers, tables can sum up novel discoveries and summarize experimental results, making the research…

Computer Vision and Pattern Recognition · Computer Science 2023-02-21 Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks

Tables present summarized and structured information to the reader, which makes table structure extraction an important part of document understanding applications. However, table structure identification is a hard problem not only because…

Computer Vision and Pattern Recognition · Computer Science 2020-02-07 Saqib Ali Khan, Syed Muhammad Daniyal Khalid, Muhammad Ali Shahzad, Faisal Shafait

TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data

Tabular data are ubiquitous for the widespread applications of tables and hence have attracted the attention of researchers to extract underlying information. One of the critical problems in mining tabular data is how to understand their…

Machine Learning · Computer Science 2021-06-17 Lun Du, Fei Gao, Xu Chen, Ran Jia, Junshan Wang, Jiang Zhang, Shi Han, Dongmei Zhang

Integrating and querying similar tables from PDF documents using deep learning

A large amount of the public data produced by enterprises is in semi-structured PDF form. Tabular data extraction from reports and other published data in PDF format is of interest for various data consolidation purposes such as analysing and…

Information Retrieval · Computer Science 2019-01-16 Rahul Anand, Hye-Young Paik, Cheng Wang

CTE: A Dataset for Contextualized Table Extraction

Relevant information in documents is often summarized in tables, helping the reader to identify useful facts. Most benchmark datasets support either document layout analysis or table understanding, but lack in providing data to apply both…

Computation and Language · Computer Science 2023-02-14 Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Table Detection in the Wild: A Novel Diverse Table Detection Dataset and Method

Recent deep learning approaches in table detection achieved outstanding performance and proved to be effective in identifying document layouts. Currently, available table detection benchmarks have many limitations, including the lack of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Mrinal Haloi, Shashank Shekhar, Nikhil Fande, Siddhant Swaroop Dash, Sanjay G