Related papers: Design2Code: Benchmarking Multimodal Code Generati…

200 papers

Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding…

Automatically generating webpage code from webpage designs can significantly reduce the workload of front-end developers, and recent Multimodal Large Language Models (MLLMs) have shown promising potential in this area. However, our…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Yi Gui, Zhen Li, Yao Wan, Yemin Shi, Hongyu Zhang, Yi Su, Bohua Chen, Dongping Chen, Siyuan Wu, Xing Zhou, Wenbin Jiang, Hai Jin, Xiangliang Zhang

Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in automated front-end engineering, e.g., generating UI code from visual designs. However, existing front-end UI code generation benchmarks have the…

Software Engineering · Computer Science 2026-03-17 Jingyu Xiao, Ming Wang, Man Ho Lam, Yuxuan Wan, Junliang Liu, Yintong Huo, Michael R. Lyu

The remarkable progress of Multi-modal Large Language Models (MLLMs) has attracted significant attention due to their superior performance in visual contexts. However, their capabilities in turning visual figures into executable code have not…

Computation and Language · Computer Science 2024-05-14 Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo

Front-end development constitutes a substantial portion of software engineering, yet converting design mockups into production-ready User Interface (UI) code remains tedious and costly. While recent work has explored automating this process…

Software Engineering · Computer Science 2026-04-16 Yi Gui, Jiawan Zhang, Yina Wang, Tianran Ma, Yao Wan, Shilin He, Dongping Chen, Zhou Zhao, Wenbin Jiang, Xuanhua Shi, Hai Jin, Philip S Yu

Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance on the design-to-code task, i.e., generating UI code from UI mock-ups. However, existing benchmarks only contain static web pages for evaluation and ignore…

Software Engineering · Computer Science 2026-03-03 Jingyu Xiao, Yuxuan Wan, Yintong Huo, Zixin Wang, Xinyi Xu, Wenxuan Wang, Zhiyao Xu, Yuhang Wang, Michael R. Lyu

Front-end engineering involves a complex workflow where engineers conceptualize designs, translate them into code, and iteratively refine the implementation. While recent benchmarks primarily focus on converting visual designs to code, we…

Computation and Language · Computer Science 2025-05-27 Haoyu Sun, Huichen Will Wang, Jiawei Gu, Linjie Li, Yu Cheng

With the rapid advancement of Generative AI technology, Multimodal Large Language Models (MLLMs) have the potential to act as AI software engineers capable of executing complex web application development. Considering that the model requires…

Computation and Language · Computer Science 2025-06-10 Zhiyu Lin, Zhengda Zhou, Zhiyuan Zhao, Tianrui Wan, Yilun Ma, Junyu Gao, Xuelong Li

Converting user interfaces into code (UI2Code) is a crucial step in website development, which is time-consuming and labor-intensive. The automation of UI2Code is essential to streamline this task, beneficial for improving the development…

Software Engineering · Computer Science 2025-06-13 Fan Wu, Cuiyun Gao, Shuqing Li, Xin-Cheng Wen, Qing Liao

Large Language Models (LLMs) have made significant strides in front-end code generation. However, existing benchmarks exhibit several critical limitations: many tasks are overly simplistic, test cases often lack rigor, and end-to-end…

Software Engineering · Computer Science 2025-06-19 Hongda Zhu, Yiwen Zhang, Bing Zhao, Jingzhe Ding, Siyao Liu, Tong Liu, Dandan Wang, Yanan Liu, Zhaojian Li

Programming often involves converting detailed and complex specifications into code, a process during which developers typically utilize visual aids to more effectively convey concepts. While recent developments in Large Multimodal Models…

Computation and Language · Computer Science 2024-09-27 Kaixin Li, Yuchen Tian, Qisheng Hu, Ziyang Luo, Zhiyong Huang, Jing Ma

We introduce Chart2Code, a new benchmark for evaluating the chart understanding and code generation capabilities of large multimodal models (LMMs). Chart2Code is explicitly designed from a user-driven perspective, capturing diverse…

Software Engineering · Computer Science 2026-04-21 Jiahao Tang, Henry Hengyuan Zhao, Lijian Wu, Zijian Zhang, Yifei Tao, Dongxing Mao, Yang Wan, Jingru Tan, Min Zeng, Min Li, Alex Jinpeng Wang

The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial advancements in artificial intelligence, significantly enhancing the capability to understand and generate multimodal content. While prior studies have…

Artificial Intelligence · Computer Science 2024-09-30 Lin Li, Guikun Chen, Hanrong Shi, Jun Xiao, Long Chen

Vision-Language Models (VLMs) have demonstrated impressive capabilities in code generation across various domains. However, their ability to replicate complex, multi-panel visualizations from real-world data remains largely unassessed. To…

We present WebMMU, a multilingual benchmark that evaluates three core web tasks: (1) website visual question answering, (2) code editing involving HTML/CSS/JavaScript, and (3) mockup-to-code generation. Unlike prior benchmarks that treat…

A well-executed graphic design typically achieves harmony in two levels, from the fine-grained design elements (color, font and layout) to the overall design. This complexity makes the comprehension of graphic design challenging, for it…

Computer Vision and Pattern Recognition · Computer Science 2024-04-24 Jieru Lin, Danqing Huang, Tiejun Zhao, Dechen Zhan, Chin-Yew Lin

Automating the transformation of user interface (UI) designs into front-end code holds significant promise for accelerating software development and democratizing design workflows. While multimodal large language models (MLLMs) can…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Yilei Jiang, Yaozhi Zheng, Yuxuan Wan, Jiaming Han, Qunzhong Wang, Michael R. Lyu, Xiangyu Yue

Large language models (LLMs) have recently demonstrated strong capabilities in generating machine learning (ML) code, enabling end-to-end pipeline construction from natural language instructions. However, existing benchmarks for ML code…

Multimodal Large Language Models (MLLMs) have shown promise in web-related tasks, but evaluating their performance in the web domain remains a challenge due to the lack of comprehensive benchmarks. Existing benchmarks are either designed…

Computation and Language · Computer Science 2024-04-10 Junpeng Liu, Yifan Song, Bill Yuchen Lin, Wai Lam, Graham Neubig, Yuanzhi Li, Xiang Yue

Web applications (web apps) have become a key arena for large language models (LLMs) to demonstrate their code generation capabilities and commercial potential. However, building a benchmark for LLM-generated web apps remains challenging…

Software Engineering · Computer Science 2026-03-17 Chenxu Liu, Yingjie Fu, Wei Yang, Ying Zhang, Tao Xie