Related papers: ChartCoder: Advancing Multimodal Large Language Mo…

MM-ReCoder: Advancing Chart-to-Code Generation with Reinforcement Learning and Self-Correction

Multimodal Large Language Models (MLLMs) have recently demonstrated promising capabilities in multimodal coding tasks such as chart-to-code generation. However, existing methods primarily rely on supervised fine-tuning (SFT), which requires…

Artificial Intelligence · Computer Science 2026-04-03 Zitian Tang, Xu Zhang, Jianbo Yuan, Yang Zou, Varad Gunjal, Songyao Jiang, Davide Modolo

Improved Iterative Refinement for Chart-to-Code Generation via Structured Instruction

Recently, multimodal large language models (MLLMs) have attracted increasing research attention due to their powerful visual understanding capabilities. While they have achieved impressive results on various vision tasks, their performance…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Chengzhi Xu, Yuyang Wang, Lai Wei, Lichao Sun, Weiran Huang

ChartCards: A Chart-Metadata Generation Framework for Multi-Task Chart Understanding

The emergence of Multi-modal Large Language Models (MLLMs) presents new opportunities for chart understanding. However, due to the fine-grained nature of these tasks, applying MLLMs typically requires large, high-quality datasets for…

Computation and Language · Computer Science 2025-10-08 Yifan Wu, Lutao Yan, Leixian Shen, Yinan Mei, Jiannan Wang, Yuyu Luo

RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation

Vision-Language Models (VLMs) have demonstrated impressive capabilities in code generation across various domains. However, their ability to replicate complex, multi-panel visualizations from real-world data remains largely unassessed. To…

Computation and Language · Computer Science 2026-03-30 Jiajun Zhang, Yuying Li, Zhixun Li, Xingyu Guo, Jingzhuo Wu, Leqi Zheng, Yiran Yang, Jianke Zhang, Qingbin Li, Shannan Yan, Zhetong Li, Changguo Jia, Junfei Wu, Zilei Wang, Qiang Liu, Liang Wang

ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing

Although multimodal large language models (MLLMs) show promise in generating chart rendering code, editing charts via code presents a greater challenge. This task demands MLLMs to integrate chart understanding and reasoning capacities,…

Computation and Language · Computer Science 2025-08-05 Xuanle Zhao, Xuexin Liu, Haoyue Yang, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen

From Charts to Code: A Hierarchical Benchmark for Multimodal Models

We introduce Chart2Code, a new benchmark for evaluating the chart understanding and code generation capabilities of large multimodal models (LMMs). Chart2Code is explicitly designed from a user-driven perspective, capturing diverse…

Software Engineering · Computer Science 2026-04-21 Jiahao Tang, Henry Hengyuan Zhao, Lijian Wu, Zijian Zhang, Yifei Tao, Dongxing Mao, Yang Wan, Jingru Tan, Min Zeng, Min Li, Alex Jinpeng Wang

Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen

ChartGen: Scaling Chart Understanding Via Code-Guided Synthetic Chart Generation

Chart-to-code reconstruction -- the task of recovering executable plotting scripts from chart images -- provides important insights into a model's ability to ground data visualizations in precise, machine-readable form. Yet many existing…

Human-Computer Interaction · Computer Science 2025-07-29 Jovana Kondic, Pengyuan Li, Dhiraj Joshi, Zexue He, Shafiq Abedin, Jennifer Sun, Ben Wiesel, Eli Schwartz, Ahmed Nassar, Bo Wu, Assaf Arbelle, Aude Oliva, Dan Gutfreund, Leonid Karlinsky, Rogerio Feris

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

Large Language Models (LLMs) have achieved remarkable success in source code understanding, yet as software systems grow in scale, computational efficiency has become a critical bottleneck. Currently, these models rely on a text-based…

Computation and Language · Computer Science 2026-04-29 Yuling Shi, Chaoxiang Xie, Zhensu Sun, Yeheng Chen, Chenxu Zhang, Longfei Yun, Chengcheng Wan, Hongyu Zhang, David Lo, Xiaodong Gu

ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning

Multimodal Large Language Models (MLLMs) have emerged as powerful tools for chart comprehension. However, they heavily rely on extracted content via OCR, which leads to numerical hallucinations when chart textual annotations are sparse.…

Artificial Intelligence · Computer Science 2025-12-02 Zhengzhuo Xu, SiNan Du, Yiyan Qi, SiwenLu, Chengjin Xu, Chun Yuan, Jian Guo

GCoder: Improving Large Language Model for Generalized Graph Problem Solving

Large Language Models (LLMs) have demonstrated strong reasoning abilities, making them suitable for complex tasks such as graph computation. The traditional reasoning-steps paradigm for graph problems is hindered by unverifiable steps, limited…

Computation and Language · Computer Science 2024-10-28 Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li

Multimodal Information Fusion for Chart Understanding: A Survey of MLLMs -- Evolution, Limitations, and Cognitive Enhancement

Chart understanding is a quintessential information fusion task, requiring the seamless integration of graphical and textual data to extract meaning. The advent of Multimodal Large Language Models (MLLMs) has revolutionized this domain, yet…

Computer Vision and Pattern Recognition · Computer Science 2026-02-12 Zhihang Yi, Jian Zhao, Jiancheng Lv, Tao Wang

ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs). ChartMimic utilizes information-intensive visual charts and textual instructions as inputs,…

Software Engineering · Computer Science 2025-03-03 Cheng Yang, Chufan Shi, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

Multi-modal large language models have demonstrated impressive performances on most vision-language tasks. However, the model generally lacks the understanding capabilities for specific domain data, particularly when it comes to…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang

On Pre-training of Multimodal Language Models Customized for Chart Understanding

Recent studies customizing Multimodal Large Language Models (MLLMs) for domain-specific tasks have yielded promising results, especially in the field of scientific chart comprehension. These studies generally utilize visual instruction…

Computer Vision and Pattern Recognition · Computer Science 2025-07-21 Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Lu Yuan, Leonid Sigal

MLLM-Based UI2Code Automation Guided by UI Layout Information

Converting user interfaces into code (UI2Code) is a crucial step in website development, which is time-consuming and labor-intensive. The automation of UI2Code is essential to streamline this task, beneficial for improving the development…

Software Engineering · Computer Science 2025-06-13 Fan Wu, Cuiyun Gao, Shuqing Li, Xin-Cheng Wen, Qing Liao

Generating Statistical Charts with Validation-Driven LLM Workflows

Generating diverse, readable statistical charts from tabular data remains challenging for LLMs, as many failures become apparent after rendering and are not detectable from data or code alone. Existing chart datasets also rarely provide…

Machine Learning · Computer Science 2026-05-04 Pavlin G. Poličar, Andraž Pevcin, Blaž Zupan

Boosting Chart-to-Code Generation in MLLM via Dual Preference-Guided Refinement

Translating chart images into executable plotting scripts, referred to as the chart-to-code generation task, requires Multimodal Large Language Models (MLLMs) to perform fine-grained visual parsing, precise code synthesis, and robust…

Computation and Language · Computer Science 2025-08-21 Zhihan Zhang, Yixin Cao, Lizi Liao

Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning

Emerging multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA). Recent efforts primarily focus on scaling up training datasets (i.e., charts, data tables, and question-answer (QA) pairs) through…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng

In-Depth and In-Breadth: Pre-training Multimodal Language Models Customized for Comprehensive Chart Understanding

Recent methods for customizing Large Vision Language Models (LVLMs) for domain-specific tasks have shown promising results in scientific chart comprehension. However, existing approaches face two major limitations: First, they rely on…

Computation and Language · Computer Science 2025-07-22 Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Alexander Jacobson, Lu Yuan, Leonid Sigal