Related papers: TableParser: Automatic Table Parsing with Weak Sup…

200 papers

Translating renderings (e.g., PDFs, scans) into hierarchical document structures is extensively demanded in the daily routines of many real-world applications. However, a holistic, principled approach to inferring the complete hierarchical…

Machine Learning · Computer Science 2021-01-26 Johannes Rausch, Octavio Martinez, Fabian Bissig, Ce Zhang, Stefan Feuerriegel

Table extraction from PDF and image documents is a ubiquitous task in the real world. Perfect extraction quality is difficult to achieve with one single out-of-box model due to (1) the wide variety of table styles, (2) the lack of training…

Human-Computer Interaction · Computer Science 2021-02-18 Nancy Xin Ru Wang, Douglas Burdick, Yunyao Li

The pool of knowledge available to mankind depends on the source of learning resources, which can vary from ancient printed documents to present electronic material. The rapid conversion of material available in traditional libraries to…

Computer Vision and Pattern Recognition · Computer Science 2014-12-25 Akmal Jahan Mac, Roshan G Ragel

In this paper we relate a number of parsing algorithms which have been developed in very different areas of parsing theory, and which include deterministic algorithms, tabular algorithms, and a parallel algorithm. We show that these…

cmp-lg · Computer Science 2008-02-03 Mark-Jan Nederhof

The task of table structure recognition aims to recognize the internal structure of a table, which is a key step to make machines understand tables. Currently, there are lots of studies on this task for different file formats such as ASCII…

Information Retrieval · Computer Science 2019-08-29 Zewen Chi, Heyan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao

Tables are an extremely powerful visual and interactive tool for structuring and manipulating data, making spreadsheet programs one of the most popular computer applications. In this paper we introduce and address the task of recommending…

Information Retrieval · Computer Science 2019-07-26 Shuo Zhang, Krisztian Balog

Infographics are often an integral component of scientific documents for reporting qualitative or quantitative findings as they make it much simpler to comprehend the underlying complex information. However, their interpretation continues…

Computer Vision and Pattern Recognition · Computer Science 2022-11-17 Anukriti Kumar, Tanuja Ganu, Saikat Guha

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including…

Computer Vision and Pattern Recognition · Computer Science 2021-12-03 Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet. Existing research for image-based table detection and recognition usually…

Computer Vision and Pattern Recognition · Computer Science 2020-07-07 Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li

Scientific documents contain tables that list important information in a concise fashion. Structure and content extraction from tables embedded within PDF research documents is a very challenging task due to the existence of visual features…

Information Retrieval · Computer Science 2022-11-01 Pratik Kayal, Mrinal Anand, Harsh Desai, Mayank Singh

Spreadsheet table detection is the task of detecting all tables on a given sheet and locating their respective ranges. Automatic table detection is a key enabling technique and an initial step in spreadsheet data intelligence. However, the…

Information Retrieval · Computer Science 2021-06-28 Haoyu Dong, Shijie Liu, Shi Han, Zhouyu Fu, Dongmei Zhang

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig

Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, Knowledge Graphs, etc., since they enhance their predictive capabilities. Unfortunately, tables…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar

Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces…

Information Retrieval · Computer Science 2020-10-20 Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs, and various other document types, a flurry of table pre-training frameworks have been proposed following the success of text and images, and they have…

Computation and Language · Computer Science 2022-05-02 Haoyu Dong, Zhoujun Cheng, Xinyi He, Mengyu Zhou, Anda Zhou, Fan Zhou, Ao Liu, Shi Han, Dongmei Zhang

Table structure recognition is necessary for a comprehensive understanding of documents. Tables in unstructured business documents are tough to parse due to the high diversity of layouts, varying alignments of contents, and the presence of…

Computer Vision and Pattern Recognition · Computer Science 2021-11-16 Sachin Raja, Ajoy Mondal, C V Jawahar

Table detection and extraction has been studied in the context of documents like reports, where tables are clearly outlined and stand out from the document structure visually. We study this topic in a rather more challenging…

Information Retrieval · Computer Science 2021-08-20 Martin Holeček, Antonín Hoskovec, Petr Baudiš, Pavel Klinger

In the digital era, table structure recognition technology is a critical tool for processing and analyzing large volumes of tabular data. Previous methods primarily focus on visual aspects of table structure recovery but often fail to…

Computer Vision and Pattern Recognition · Computer Science 2024-09-23 Zhenrong Zhang, Shuhang Liu, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Yu Hu

Although Transformer-based architectures excel at processing textual information, their naive adaptation for tabular data often involves flattening the table structure. This simplification can lead to the loss of essential…

Computation and Language · Computer Science 2025-03-04 Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier

Documents are core carriers of information and knowledge, with broad applications in finance, healthcare, and scientific research. Tables, as the main medium for structured data, encapsulate key information and are among the most critical…

Computation and Language · Computer Science 2025-08-15 Xuan Li, Jialiang Dong, Raymond Wong