Related papers: ChartCards: A Chart-Metadata Generation Framework …

200 papers

Multi-modal large language models have demonstrated impressive performance on most vision-language tasks. However, these models generally lack the capability to understand domain-specific data, particularly when it comes to…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-29 · Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang

Chart question answering (ChartQA) tasks play a critical role in interpreting and extracting insights from visualization charts. While recent advancements in multimodal large language models (MLLMs) like GPT-4o have shown promise in…

Computation and Language · Computer Science · 2024-11-07 · Yifan Wu, Lutao Yan, Leixian Shen, Yunhai Wang, Nan Tang, Yuyu Luo

Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in chart understanding tasks. However, interpreting charts with textual descriptions often leads to information loss, as it fails to fully capture the dense…

Artificial Intelligence · Computer Science · 2025-07-03 · Xuanle Zhao, Xianzhen Luo, Qi Shi, Chi Chen, Shuo Wang, Zhiyuan Liu, Maosong Sun

The ability to read scientific plots effectively, known as chart understanding, is central to building effective agents for science. However, existing multimodal large language models (MLLMs), especially open-source ones, are still…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-11 · Yuwei Yang, Zeyu Zhang, Yunzhong Hou, Zhuowan Li, Gaowen Liu, Ali Payani, Yuan-Sen Ting, Liang Zheng

Chart understanding is a quintessential information fusion task, requiring the seamless integration of graphical and textual data to extract meaning. The advent of Multimodal Large Language Models (MLLMs) has revolutionized this domain, yet…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-12 · Zhihang Yi, Jian Zhao, Jiancheng Lv, Tao Wang

Charts provide visual representations of data and are widely used for analyzing information, addressing queries, and conveying insights to others. Various chart-related downstream tasks have emerged recently, such as question-answering and…

Computation and Language · Computer Science · 2024-03-15 · Ahmed Masry, Mehrad Shahmohammadi, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty

Multimodal Large Language Models (MLLMs) have demonstrated impressive abilities across various tasks, including visual question answering and chart comprehension, yet existing benchmarks for chart-related tasks fall short in capturing the…

Computation and Language · Computer Science · 2025-02-11 · Zifeng Zhu, Mengzhao Jia, Zhihan Zhang, Lang Li, Meng Jiang

With the rapid development of large language models (LLMs) and their integration into large multimodal models (LMMs), there has been impressive progress in zero-shot completion of user-oriented vision-language tasks. However, a gap remains…

Computation and Language · Computer Science · 2024-04-16 · Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu

Emerging multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA). Recent efforts primarily focus on scaling up training datasets (i.e., charts, data tables, and question-answer (QA) pairs) through…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-13 · Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng

Generating diverse, readable statistical charts from tabular data remains challenging for LLMs, as many failures become apparent after rendering and are not detectable from data or code alone. Existing chart datasets also rarely provide…

Machine Learning · Computer Science · 2026-05-04 · Pavlin G. Poličar, Andraž Pevcin, Blaž Zupan

Although multimodal large language models (MLLMs) show promise in generating chart rendering code, editing charts via code presents a greater challenge. This task requires MLLMs to integrate chart understanding and reasoning capacities,…

Computation and Language · Computer Science · 2025-08-05 · Xuanle Zhao, Xuexin Liu, Haoyue Yang, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen

Many versatile multi-modal large language models (MLLMs) have emerged recently. However, their capacity to query information depicted in visual charts and engage in reasoning based on the queried contents remains…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-29 · Renqiu Xia, Bo Zhang, Hancheng Ye, Xiangchao Yan, Qi Liu, Hongbin Zhou, Zijun Chen, Peng Ye, Min Dou, Botian Shi, Junchi Yan, Yu Qiao

Multimodal Large Language Models (MLLMs) have shown impressive capabilities in image understanding and generation. However, current benchmarks fail to accurately evaluate the chart comprehension of MLLMs due to limited chart types and…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-21 · Zhengzhuo Xu, Sinan Du, Yiyan Qi, Chengjin Xu, Chun Yuan, Jian Guo

With advancements in deep learning (DL) and computer vision techniques, the field of chart understanding is evolving rapidly. In particular, multimodal large language models (MLLMs) are proving to be efficient and accurate in understanding…

Artificial Intelligence · Computer Science · 2026-01-21 · Ahmad Mustapha, Charbel Toumieh, Mariette Awad

Recent studies customizing Multimodal Large Language Models (MLLMs) for domain-specific tasks have yielded promising results, especially in the field of scientific chart comprehension. These studies generally utilize visual instruction…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-21 · Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Lu Yuan, Leonid Sigal

Recent methods for customizing Large Vision Language Models (LVLMs) for domain-specific tasks have shown promising results in scientific chart comprehension. However, existing approaches face two major limitations: First, they rely on…

Computation and Language · Computer Science · 2025-07-22 · Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Alexander Jacobson, Lu Yuan, Leonid Sigal

Charts are widely used to present complex information. Deriving meaningful insights in real-world contexts often requires interpreting multiple related charts together. Research on understanding multi-chart images has not been extensively…

Computation and Language · Computer Science · 2026-04-24 · Azher Ahmed Efat, Seok Hwan Song, Wallapak Tavanapong

Chart summarization is a crucial task for blind and visually impaired individuals as it is their primary means of accessing and interpreting graphical data. Crafting high-quality descriptions is challenging because it requires precise…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-31 · Omar Moured, Jiaming Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen

Complex chart understanding tasks demand advanced visual recognition and reasoning capabilities from multimodal large language models (MLLMs). However, current research provides limited coverage of complex chart scenarios and…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-05 · Duo Xu, Hao Cheng, Xin Lin, Zhen Xie, Hao Wang

We introduce Chart2Code, a new benchmark for evaluating the chart understanding and code generation capabilities of large multimodal models (LMMs). Chart2Code is explicitly designed from a user-driven perspective, capturing diverse…

Software Engineering · Computer Science · 2026-04-21 · Jiahao Tang, Henry Hengyuan Zhao, Lijian Wu, Zijian Zhang, Yifei Tao, Dongxing Mao, Yang Wan, Jingru Tan, Min Zeng, Min Li, Alex Jinpeng Wang