Related papers: TableParser: Automatic Table Parsing with Weak Sup…

DocParser: Hierarchical Structure Parsing of Document Renderings

Translating renderings (e.g., PDFs, scans) into hierarchical document structures is extensively demanded in the daily routines of many real-world applications. However, a holistic, principled approach to inferring the complete hierarchical…

Machine Learning · Computer Science 2021-01-26 Johannes Rausch, Octavio Martinez, Fabian Bissig, Ce Zhang, Stefan Feuerriegel

TableLab: An Interactive Table Extraction System with Adaptive Deep Learning

Table extraction from PDF and image documents is a ubiquitous task in the real world. Perfect extraction quality is difficult to achieve with a single out-of-the-box model due to (1) the wide variety of table styles, (2) the lack of training…

Human-Computer Interaction · Computer Science 2021-02-18 Nancy Xin Ru Wang, Douglas Burdick, Yunyao Li

Locating Tables in Scanned Documents for Reconstructing and Republishing (ICIAfS14)

The pool of knowledge available to mankind depends on the source of learning resources, which can vary from ancient printed documents to present electronic material. The rapid conversion of material available in traditional libraries to…

Computer Vision and Pattern Recognition · Computer Science 2014-12-25 Akmal Jahan Mac, Roshan G Ragel

An Optimal Tabular Parsing Algorithm

In this paper we relate a number of parsing algorithms which have been developed in very different areas of parsing theory, and which include deterministic algorithms, tabular algorithms, and a parallel algorithm. We show that these…

cmp-lg · Computer Science 2008-02-03 Mark-Jan Nederhof

Complicated Table Structure Recognition

The task of table structure recognition aims to recognize the internal structure of a table, which is a key step to make machines understand tables. Currently, there are lots of studies on this task for different file formats such as ASCII…

Information Retrieval · Computer Science 2019-08-29 Zewen Chi, Heyan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao

Recommending Related Tables

Tables are an extremely powerful visual and interactive tool for structuring and manipulating data, making spreadsheet programs one of the most popular computer applications. In this paper we introduce and address the task of recommending…

Information Retrieval · Computer Science 2019-07-26 Shuo Zhang, Krisztian Balog

ChartParser: Automatic Chart Parsing for Print-Impaired

Infographics are often an integral component of scientific documents for reporting qualitative or quantitative findings as they make it much simpler to comprehend the underlying complex information. However, their interpretation continues…

Computer Vision and Pattern Recognition · Computer Science 2022-11-17 Anukriti Kumar, Tanuja Ganu, Saikat Guha

Flexible Table Recognition and Semantic Interpretation System

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including…

Computer Vision and Pattern Recognition · Computer Science 2021-12-03 Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

TableBank: A Benchmark Dataset for Table Detection and Recognition

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet. Existing research for image-based table detection and recognition usually…

Computer Vision and Pattern Recognition · Computer Science 2020-07-07 Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li

Tables to LaTeX: structure and content extraction from scientific tables

Scientific documents contain tables that list important information in a concise fashion. Structure and content extraction from tables embedded within PDF research documents is a very challenging task due to the existence of visual features…

Information Retrieval · Computer Science 2022-11-01 Pratik Kayal, Mrinal Anand, Harsh Desai, Mayank Singh

TableSense: Spreadsheet Table Detection with Convolutional Neural Networks

Spreadsheet table detection is the task of detecting all tables on a given sheet and locating their respective ranges. Automatic table detection is a key enabling technique and an initial step in spreadsheet data intelligence. However, the…

Information Retrieval · Computer Science 2021-06-28 Haoyu Dong, Shijie Liu, Shi Han, Zhouyu Fu, Dongmei Zhang

TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig

TableFormer: Table Structure Understanding with Transformers

Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, Knowledge Graphs, etc., since they enhance their predictive capabilities. Unfortunately, tables…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar

A Conglomerate of Multiple OCR Table Detection and Extraction

Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces…

Information Retrieval · Computer Science 2020-10-20 Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks

Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs, and various other document types, a flurry of table pre-training frameworks have been proposed following the success of text and images, and they have…

Computation and Language · Computer Science 2022-05-02 Haoyu Dong, Zhoujun Cheng, Xinyi He, Mengyu Zhou, Anda Zhou, Fan Zhou, Ao Liu, Shi Han, Dongmei Zhang

Visual Understanding of Complex Table Structures from Document Images

Table structure recognition is necessary for a comprehensive understanding of documents. Tables in unstructured business documents are tough to parse due to the high diversity of layouts, varying alignments of contents, and the presence of…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Sachin Raja, Ajoy Mondal, C V Jawahar

Table understanding in structured documents

Table detection and extraction has been studied in the context of documents like reports, where tables are clearly outlined and stand out from the document structure visually. We study this topic in a rather more challenging…

Information Retrieval · Computer Science 2021-08-20 Martin Holeček, Antonín Hoskovec, Petr Baudiš, Pavel Klinger

UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition

In the digital era, table structure recognition technology is a critical tool for processing and analyzing large volumes of tabular data. Previous methods primarily focus on visual aspects of table structure recovery but often fail to…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Zhenrong Zhang, Shuhang Liu, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Yu Hu

Structural Deep Encoding for Table Question Answering

Although Transformer-based architectures excel at processing textual information, their naive adaptation for tabular data often involves flattening the table structure. This simplification can lead to the loss of essential…

Computation and Language · Computer Science 2025-03-04 Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier

From Surface to Semantics: Semantic Structure Parsing for Table-Centric Document Analysis

Documents are core carriers of information and knowledge, with broad applications in finance, healthcare, and scientific research. Tables, as the main medium for structured data, encapsulate key information and are among the most critical…

Computation and Language · Computer Science 2025-08-15 Xuan Li, Jialiang Dong, Raymond Wong