Related papers: GFTE: Graph-based Financial Table Extraction

200 papers

Automatic table detection in PDF documents has achieved great success, but tabular data extraction is still challenging due to the integrity and noise issues in detected table areas. Accurate data extraction is extremely crucial in…

Computation and Language · Computer Science 2022-05-24 Siwen Luo, Mengting Wu, Yiwen Gong, Wanying Zhou, Josiah Poon

Documents are often used for knowledge sharing and preservation in business and science, within which are tables that capture most of the critical data. Unfortunately, most documents are stored and distributed as PDF or scanned images,…

Computer Vision and Pattern Recognition · Computer Science 2020-12-03 Xinyi Zheng, Doug Burdick, Lucian Popa, Xu Zhong, Nancy Xin Ru Wang

Table extraction has long been a pervasive problem in financial services. This is more challenging in the image domain, where content is locked behind cumbersome pixel format. Luckily, advances in deep learning for image segmentation, OCR,…

Computer Vision and Pattern Recognition · Computer Science 2024-05-10 William Watson, Bo Liu

Table extraction from document images is a challenging AI problem, and labelled data for many content domains is difficult to come by. Existing table extraction datasets often focus on scientific tables due to the vast amount of academic…

Machine Learning · Computer Science 2024-12-06 Ethan Bradley, Muhammad Roman, Karen Rafferty, Barry Devereux

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including…

Computer Vision and Pattern Recognition · Computer Science 2021-12-03 Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

Document structure analysis, such as zone segmentation and table recognition, is a complex problem in document processing and is an active area of research. The recent success of deep learning in solving various computer vision and machine…

Computer Vision and Pattern Recognition · Computer Science 2019-07-04 Shah Rukh Qasim, Hassan Mahmood, Faisal Shafait

Important information that relates to a specific topic in a document is often organized in tabular format to assist readers with information retrieval and comparison, which may be difficult to provide in natural language. However, tabular…

Computer Vision and Pattern Recognition · Computer Science 2020-03-05 Xu Zhong, Elaheh ShafieiBavani, Antonio Jimeno Yepes

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig

Data in tabular format occurs frequently in real-world applications. Graph Neural Networks (GNNs) have recently been extended to effectively handle such data, allowing feature interactions to be captured through representation…

Machine Learning · Computer Science 2024-08-14 Amr Alkhatib, Sofiane Ennadir, Henrik Boström, Michalis Vazirgiannis

The task of table structure recognition aims to recognize the internal structure of a table, which is a key step to make machines understand tables. Currently, there are lots of studies on this task for different file formats such as ASCII…

Information Retrieval · Computer Science 2019-08-29 Zewen Chi, Heyan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao

Table extraction from PDF and image documents is a ubiquitous task in the real world. Perfect extraction quality is difficult to achieve with one single out-of-box model due to (1) the wide variety of table styles, (2) the lack of training…

Human-Computer Interaction · Computer Science 2021-02-18 Nancy Xin Ru Wang, Douglas Burdick, Yunyao Li

Extracting information from tables in documents presents a significant challenge in many industries and in academic research. Existing methods which take a bottom-up approach of integrating lines into cells and rows or columns neglect the…

Neural and Evolutionary Computing · Computer Science 2019-04-04 Nataliya Le Vine, Matthew Zeigenfuse, Mark Rowan

This work presents a novel approach to tabular data prediction leveraging graph structure learning and graph neural networks. Despite the prevalence of tabular data in real-world applications, traditional deep learning methods often…

Machine Learning · Computer Science 2023-05-26 Jay Chiehen Liao, Cheng-Te Li

The digital conversion of information stored in documents is a great source of knowledge. In contrast to the document's text, the conversion of the document's embedded graphics, such as charts and plots, has been much less explored. We…

Computer Vision and Pattern Recognition · Computer Science 2021-11-30 Joseph Shtok, Sivan Harary, Ophir Azulai, Adi Raz Goldfarb, Assaf Arbelle, Leonid Karlinsky

Tables are widely used in several types of documents since they can bring important information in a structured way. In scientific papers, tables can sum up novel discoveries and summarize experimental results, making the research…

Computer Vision and Pattern Recognition · Computer Science 2023-02-21 Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Tables present summarized and structured information to the reader, which makes table structure extraction an important part of document understanding applications. However, table structure identification is a hard problem not only because…

Computer Vision and Pattern Recognition · Computer Science 2020-02-07 Saqib Ali Khan, Syed Muhammad Daniyal Khalid, Muhammad Ali Shahzad, Faisal Shafait

Tabular data are ubiquitous for the widespread applications of tables and hence have attracted the attention of researchers to extract underlying information. One of the critical problems in mining tabular data is how to understand their…

Machine Learning · Computer Science 2021-06-17 Lun Du, Fei Gao, Xu Chen, Ran Jia, Junshan Wang, Jiang Zhang, Shi Han, Dongmei Zhang

A large amount of public data produced by enterprises is in semi-structured PDF form. Tabular data extraction from reports and other published data in PDF format is of interest for various data consolidation purposes such as analysing and…

Information Retrieval · Computer Science 2019-01-16 Rahul Anand, Hye-Young Paik, Cheng Wang

Relevant information in documents is often summarized in tables, helping the reader to identify useful facts. Most benchmark datasets support either document layout analysis or table understanding, but lack in providing data to apply both…

Computation and Language · Computer Science 2023-02-14 Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Recent deep learning approaches in table detection achieved outstanding performance and proved to be effective in identifying document layouts. Currently, available table detection benchmarks have many limitations, including the lack of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Mrinal Haloi, Shashank Shekhar, Nikhil Fande, Siddhant Swaroop Dash, Sanjay G