Related papers: CHARTER: heatmap-based multi-type chart data extra…

Towards an efficient framework for Data Extraction from Chart Images

In this paper, we fill the research gap by adopting state-of-the-art computer vision techniques for the data extraction stage in a data mining system. As shown in Fig.1, this stage contains two subtasks, namely, plot element detection and…

Computer Vision and Pattern Recognition · Computer Science 2021-05-06 Weihong Ma, Hesuo Zhang, Shuang Yan, Guangshun Yao, Yichao Huang, Hui Li, Yaqiang Wu, Lianwen Jin

Tensor Fields for Data Extraction from Chart Images: Bar Charts and Scatter Plots

Charts are an essential part of both graphicacy (graphical literacy), and statistical literacy. As chart understanding has become increasingly relevant in data science, automating chart analysis by processing raster images of the charts has…

Computer Vision and Pattern Recognition · Computer Science 2020-10-07 Jaya Sreevalsan-Nair, Komal Dadhich, Siri Chandana Daggubati

Automatic Identification and Data Extraction from 2-Dimensional Plots in Digital Documents

Most search engines index the textual content of documents in digital libraries. However, scholarly articles frequently report important findings in figures for visual impact and the contents of these figures are not indexed. These contents…

Computer Vision and Pattern Recognition · Computer Science 2008-09-11 William Brouwer, Saurabh Kataria, Sujatha Das, Prasenjit Mitra, C. L. Giles

Scatteract: Automated extraction of data from scatter plots

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical…

Computer Vision and Pattern Recognition · Computer Science 2018-10-10 Mathieu Cliche, David Rosenberg, Dhruv Madeka, Connie Yee

VectorGraphNET: Graph Attention Networks for Accurate Segmentation of Complex Technical Drawings

This paper introduces a new approach to extract and analyze vector data from technical drawings in PDF format. Our method involves converting PDF files into SVG format and creating a feature-rich graph representation, which captures the…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Andrea Carrara, Stavros Nousias, André Borrmann

ChartDETR: A Multi-shape Detection Network for Visual Chart Recognition

Visual chart recognition systems are gaining increasing attention due to the growing demand for automatically identifying table headers and values from chart images. Current methods rely on keypoint detection to estimate data element shapes…

Computer Vision and Pattern Recognition · Computer Science 2023-08-16 Wenyuan Xue, Dapeng Chen, Baosheng Yu, Yifei Chen, Sai Zhou, Wei Peng

Chart-Text: A Fully Automated Chart Image Descriptor

Images greatly help in understanding, interpreting and visualizing data. Adding textual description to images is the first and foremost principle of web accessibility. Visually impaired users using screen readers will use these textual…

Computer Vision and Pattern Recognition · Computer Science 2018-12-31 Abhijit Balaji, Thuvaarakkesh Ramanathan, Venkateshwarlu Sonathi

ChartEye: A Deep Learning Framework for Chart Information Extraction

The widespread use of charts and infographics as a means of data visualization in various domains has inspired recent research in automated chart understanding. However, information extraction from chart images is a complex multitasked…

Computer Vision and Pattern Recognition · Computer Science 2024-08-30 Osama Mustafa, Muhammad Khizer Ali, Momina Moetesum, Imran Siddiqi

Scientific Dataset Discovery via Topic-level Recommendation

Data intensive research requires the support of appropriate datasets. However, it is often time-consuming to discover usable datasets matching a specific research topic. We formulate the dataset discovery problem on an attributed…

Information Retrieval · Computer Science 2021-06-08 Basmah Altaf, Shichao Pei, Xiangliang Zhang

Graph Neural Networks and Representation Embedding for Table Extraction in PDF Documents

Tables are widely used in several types of documents since they can bring important information in a structured way. In scientific papers, tables can sum up novel discoveries and summarize experimental results, making the research…

Computer Vision and Pattern Recognition · Computer Science 2023-02-21 Andrea Gemelli, Emanuele Vivoli, Simone Marinai

Timestamping Documents and Beliefs

Most of the textual information available to us are temporally variable. In a world where information is dynamic, time-stamping them is a very important task. Documents are a good source of information and are used for many tasks like,…

Computation and Language · Computer Science 2021-06-29 Swayambhu Nath Ray

GFTE: Graph-based Financial Table Extraction

Tabular data is a crucial form of information expression, which can organize data in a standard structure for easy information retrieval and comparison. However, in financial industry and many other fields tables are often disclosed in…

Computer Vision and Pattern Recognition · Computer Science 2020-03-18 Yiren Li, Zheng Huang, Junchi Yan, Yi Zhou, Fan Ye, Xianhui Liu

Chart-to-Text: Generating Natural Language Descriptions for Charts by Adapting the Transformer Model

Information visualizations such as bar charts and line charts are very popular for exploring data and communicating insights. Interpreting and making sense of such visualizations can be challenging for some people, such as those who are…

Computation and Language · Computer Science 2020-12-01 Jason Obeid, Enamul Hoque

ChartKG: A Knowledge-Graph-Based Representation for Chart Images

Chart images, such as bar charts, pie charts, and line charts, are explosively produced due to the wide usage of data visualizations. Accordingly, knowledge mining from chart images is becoming increasingly important, which can benefit…

Artificial Intelligence · Computer Science 2024-10-15 Zhiguang Zhou, Haoxuan Wang, Zhengqing Zhao, Fengling Zheng, Yongheng Wang, Wei Chen, Yong Wang

A Survey and Approach to Chart Classification

Charts represent an essential source of visual information in documents and facilitate a deep understanding and interpretation of information typically conveyed numerically. In the scientific literature, there are many charts, each with its…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Anurag Dhote, Mohammed Javed, David S Doermann

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods suffer from either heuristic rules or an…

Computer Vision and Pattern Recognition · Computer Science 2023-04-06 Zhi-Qi Cheng, Qi Dai, Siyao Li, Jingdong Sun, Teruko Mitamura, Alexander G. Hauptmann

A Conglomerate of Multiple OCR Table Detection and Extraction

Information representation as tables are compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used, however industry still faces…

Information Retrieval · Computer Science 2020-10-20 Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation

Object detection in documents is a key step to automate the structural elements identification process in a digital or scanned document through understanding the hierarchical structure and relationships between different elements. Large and…

Computer Vision and Pattern Recognition · Computer Science 2024-02-21 Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal

Tables to LaTeX: structure and content extraction from scientific tables

Scientific documents contain tables that list important information in a concise fashion. Structure and content extraction from tables embedded within PDF research documents is a very challenging task due to the existence of visual features…

Information Retrieval · Computer Science 2022-11-01 Pratik Kayal, Mrinal Anand, Harsh Desai, Mayank Singh

GenPlot: Increasing the Scale and Diversity of Chart Derendering Data

Vertical bars, horizontal bars, dot, scatter, and line plots provide a diverse set of visualizations to represent data. To understand these plots, one must be able to recognize textual components, locate data points in a plot, and process…

Computer Vision and Pattern Recognition · Computer Science 2023-06-21 Brendan Artley