Related papers: ChartCoder: Advancing Multimodal Large Language Mo…

200 papers

Multimodal Large Language Models (MLLMs) have recently demonstrated promising capabilities in multimodal coding tasks such as chart-to-code generation. However, existing methods primarily rely on supervised fine-tuning (SFT), which requires…

Artificial Intelligence · Computer Science 2026-04-03 Zitian Tang, Xu Zhang, Jianbo Yuan, Yang Zou, Varad Gunjal, Songyao Jiang, Davide Modolo

Recently, multimodal large language models (MLLMs) have attracted increasing research attention due to their powerful visual understanding capabilities. While they have achieved impressive results on various vision tasks, their performance…

Computer Vision and Pattern Recognition · Computer Science 2026-03-18 Chengzhi Xu, Yuyang Wang, Lai Wei, Lichao Sun, Weiran Huang

The emergence of Multi-modal Large Language Models (MLLMs) presents new opportunities for chart understanding. However, due to the fine-grained nature of these tasks, applying MLLMs typically requires large, high-quality datasets for…

Computation and Language · Computer Science 2025-10-08 Yifan Wu, Lutao Yan, Leixian Shen, Yinan Mei, Jiannan Wang, Yuyu Luo

Vision-Language Models (VLMs) have demonstrated impressive capabilities in code generation across various domains. However, their ability to replicate complex, multi-panel visualizations from real-world data remains largely unassessed. To…

Although multimodal large language models (MLLMs) show promise in generating chart rendering code, editing charts via code presents a greater challenge. This task demands MLLMs to integrate chart understanding and reasoning capacities,…

Computation and Language · Computer Science 2025-08-05 Xuanle Zhao, Xuexin Liu, Haoyue Yang, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen

We introduce Chart2Code, a new benchmark for evaluating the chart understanding and code generation capabilities of large multimodal models (LMMs). Chart2Code is explicitly designed from a user-driven perspective, capturing diverse…

Software Engineering · Computer Science 2026-04-21 Jiahao Tang, Henry Hengyuan Zhao, Lijian Wu, Zijian Zhang, Yifei Tao, Dongxing Mao, Yang Wan, Jingru Tan, Min Zeng, Min Li, Alex Jinpeng Wang

Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding…

Chart-to-code reconstruction -- the task of recovering executable plotting scripts from chart images -- provides important insights into a model's ability to ground data visualizations in precise, machine-readable form. Yet many existing…

Large Language Models (LLMs) have achieved remarkable success in source code understanding, yet as software systems grow in scale, computational efficiency has become a critical bottleneck. Currently, these models rely on a text-based…

Computation and Language · Computer Science 2026-04-29 Yuling Shi, Chaoxiang Xie, Zhensu Sun, Yeheng Chen, Chenxu Zhang, Longfei Yun, Chengcheng Wan, Hongyu Zhang, David Lo, Xiaodong Gu

Multimodal Large Language Models (MLLMs) have emerged as powerful tools for chart comprehension. However, they heavily rely on extracted content via OCR, which leads to numerical hallucinations when chart textual annotations are sparse.…

Artificial Intelligence · Computer Science 2025-12-02 Zhengzhuo Xu, SiNan Du, Yiyan Qi, SiwenLu, Chengjin Xu, Chun Yuan, Jian Guo

Large Language Models (LLMs) have demonstrated strong reasoning abilities, making them suitable for complex tasks such as graph computation. Traditional reasoning steps paradigm for graph problems is hindered by unverifiable steps, limited…

Computation and Language · Computer Science 2024-10-28 Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li

Chart understanding is a quintessential information fusion task, requiring the seamless integration of graphical and textual data to extract meaning. The advent of Multimodal Large Language Models (MLLMs) has revolutionized this domain, yet…

Computer Vision and Pattern Recognition · Computer Science 2026-02-12 Zhihang Yi, Jian Zhao, Jiancheng Lv, Tao Wang

We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs). ChartMimic utilizes information-intensive visual charts and textual instructions as inputs,…

Software Engineering · Computer Science 2025-03-03 Cheng Yang, Chufan Shi, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang

Multi-modal large language models have demonstrated impressive performances on most vision-language tasks. However, the model generally lacks the understanding capabilities for specific domain data, particularly when it comes to…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang

Recent studies customizing Multimodal Large Language Models (MLLMs) for domain-specific tasks have yielded promising results, especially in the field of scientific chart comprehension. These studies generally utilize visual instruction…

Computer Vision and Pattern Recognition · Computer Science 2025-07-21 Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Lu Yuan, Leonid Sigal

Converting user interfaces into code (UI2Code) is a crucial step in website development, which is time-consuming and labor-intensive. The automation of UI2Code is essential to streamline this task, beneficial for improving the development…

Software Engineering · Computer Science 2025-06-13 Fan Wu, Cuiyun Gao, Shuqing Li, Xin-Cheng Wen, Qing Liao

Generating diverse, readable statistical charts from tabular data remains challenging for LLMs, as many failures become apparent after rendering and are not detectable from data or code alone. Existing chart datasets also rarely provide…

Machine Learning · Computer Science 2026-05-04 Pavlin G. Poličar, Andraž Pevcin, Blaž Zupan

Translating chart images into executable plotting scripts, referred to as the chart-to-code generation task, requires Multimodal Large Language Models (MLLMs) to perform fine-grained visual parsing, precise code synthesis, and robust…

Computation and Language · Computer Science 2025-08-21 Zhihan Zhang, Yixin Cao, Lizi Liao

Emerging multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA). Recent efforts primarily focus on scaling up training datasets (i.e., charts, data tables, and question-answer (QA) pairs) through…

Computer Vision and Pattern Recognition · Computer Science 2024-08-13 Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng

Recent methods for customizing Large Vision Language Models (LVLMs) for domain-specific tasks have shown promising results in scientific chart comprehension. However, existing approaches face two major limitations: First, they rely on…

Computation and Language · Computer Science 2025-07-22 Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Alexander Jacobson, Lu Yuan, Leonid Sigal