Related papers: Aligning benchmark datasets for table structure re…

200 papers

Tabular data in digital documents is widely used to express compact and important information for readers. However, it is challenging to parse tables from unstructured digital documents, such as PDFs and images, into machine-readable format…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-09 · Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

Table structure recognition (TSR) holds widespread practical importance by parsing tabular images into structured representations, yet encounters significant challenges when processing complex layouts involving merged or empty cells.…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-20 · Boming Chen, Zining Wang, Zhentao Guo, Jianqiang Liu, Chen Duan, Yu Gu, Kai zhou, Pengfei Yan

Tables are pervasive in diverse documents, making table recognition (TR) a fundamental task in document analysis. Existing modular TR pipelines separately model table structure and content, leading to suboptimal integration and complex…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-25 · Chunxia Qin, Chenyu Liu, Pengcheng Xia, Jun Du, Baocai Yin, Bing Yin, Cong Liu

Dealing with tabular data is challenging due to partial information, noise, and heterogeneous structure. Existing techniques often struggle to simultaneously address key aspects of tabular data such as textual information, a variable number…

Machine Learning · Computer Science · 2025-06-10 · Wei Min Loh, Jiaqi Shang, Pascal Poupart

Table structure recognition (TSR) requires both table-level coherence (row/column counts, headers, spanning cells) and precise separator localization. We introduce FastTab, a grid-centric TSR model that avoids autoregressive HTML decoding…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-22 · Laziz Hamdi, Amine Tamasna, Pascal Boisson, Thierry Paquet

Structured tabular data is a fundamental data type in numerous fields, and the capacity to reason over tables is crucial for answering questions and validating hypotheses. However, constructing labeled data for complex reasoning tasks is…

Computation and Language · Computer Science · 2024-06-24 · Zhenyu Li, Xiuxing Li, Sunqi Fan, Jianyong Wang

Table Retrieval (TR) has traditionally been formulated as an ad-hoc retrieval problem, where relevance is primarily determined by topical semantic similarity. With the growing adoption of LLM-based agentic systems, access to structured data…

Information Retrieval · Computer Science · 2026-05-04 · Rihui Jin, Yuchen Lu, Ting Zhang, Jun Wang, Kuicai Dong, Zhaocheng Du, Dongping Liu, Gang Wang, Yong Liu, Guilin Qi

Every data selection method inherently has a target. In practice, these targets often emerge implicitly through benchmark-driven iteration: researchers develop selection strategies, train models, measure benchmark performance, then refine…

Tables convey factual and quantitative data with implicit conventions created by humans that are often challenging for machines to parse. Prior work on table recognition (TR) has mainly centered around complex task-specific combinations of…

Computer Vision and Pattern Recognition · Computer Science · 2024-05-28 · ShengYun Peng, Aishwarya Chakravarthy, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau

Despite its real-world significance, model performance on tabular data remains underexplored, leaving uncertainty about which model to rely on and which prompt configuration to adopt. To address this gap, we create ToRR, a benchmark for…

To address the challenges of table structure recognition, we propose a novel Split-Merge-based top-down model optimized for large, densely populated tables. Our approach formulates row and column splitting as sequence labeling tasks,…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-20 · Qiyu Hou, Jun Wang

This paper presents a novel approach to table structure recognition that leverages guided anchors. The concept differs from current state-of-the-art approaches for table structure recognition that naively apply object detection…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-22 · Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki, Muhammad Noman Afzal, Muhammad Zeshan Afzal

In the realm of neural architecture design, achieving high performance is largely reliant on the manual expertise of researchers. Despite the emergence of Neural Architecture Search (NAS) as a promising technique for automating this…

Machine Learning · Computer Science · 2025-01-07 · Yannis Y. He

Many organizations rely on data from government and third-party sources, and those sources rarely follow the same data formatting. This introduces challenges in integrating data from multiple sources or aligning external sources with…

Databases · Computer Science · 2023-12-27 · Arash Dargahi Nobari, Davood Rafiei

Table structure recognition (TSR) aims to convert tabular images into a machine-readable format, where a visual encoder extracts image features and a textual decoder generates table-representing tokens. Existing approaches use classic…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-10 · ShengYun Peng, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau

Table Structure Recognition (TSR) requires the logical reasoning ability of large language models (LLMs) to handle complex table layouts, but current datasets are limited in scale and quality, hindering effective use of this reasoning…

Databases · Computer Science · 2026-04-16 · Ruilin Zhang, Kai Yang

Extracting tables from documents is a critical task across various industries, especially on business documents like invoices and reports. Existing systems based on DEtection TRansformer (DETR) such as TAble TRansformer (TATR), offer…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-25 · Eliott Thomas, Mickael Coustaty, Aurelie Joseph, Gaspar Deloin, Elodie Carel, Vincent Poulain D'Andecy, Jean-Marc Ogier

Although Transformer-based architectures excel at processing textual information, their naive adaptation to tabular data often involves flattening the table structure. This simplification can lead to the loss of essential…

Computation and Language · Computer Science · 2025-03-04 · Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier

Tabular data are fundamental in common machine learning applications, ranging from finance to genomics and healthcare. This paper focuses on tabular regression tasks, a field where deep learning (DL) methods are not consistently superior to…

Machine Learning · Computer Science · 2024-12-17 · Hong-Wei Wu, Wei-Yao Wang, Kuang-Da Wang, Wen-Chih Peng

Table annotation is crucial for making web and enterprise tables usable in downstream NLP applications. Unlike textual data, where learning semantically rich token or sentence embeddings often suffices, tables are structured combinations of…

Machine Learning · Computer Science · 2026-04-22 · Ehsan Hoseinzade, Ke Wang, Anandharaju Durai Raju