
Related papers: VoxelCodeBench: Benchmarking 3D World Modeling Thr…

200 papers

AI-assisted coding has rapidly reshaped software practice and research workflows, yet today's models still struggle to produce correct code for complex 3D geometric vision. If models could reliably write such code, the research of our…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-01 · Wenyi Li, Renkai Luo, Yue Yu, Huan-ang Gao, Mingju Gao, Li Yuan, Chaoyou Fu, Hao Zhao

Recent progress in generative video models, such as Veo-3, has shown surprising zero-shot reasoning abilities, creating a growing need for systematic and reliable evaluation. We introduce V-ReasonBench, a benchmark designed to assess video…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-21 · Yang Luo, Xuanlei Zhao, Baijiong Lin, Lingting Zhu, Liyao Tang, Yuqi Liu, Ying-Cong Chen, Shengju Qian, Xin Wang, Yang You

Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation.…

Geometric problem solving constitutes a critical branch of mathematical reasoning, requiring precise analysis of shapes and spatial relationships. Current evaluations of geometric reasoning in vision-language models (VLMs) face limitations,…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-01 · Yuan Feng, Yue Yang, Xiaohan He, Jiatong Zhao, Jianlong Chen, Zijun Chen, Daocheng Fu, Qi Liu, Renqiu Xia, Bo Zhang, Junchi Yan

Recent advances in vision-language models (VLMs) have expanded their multimodal code generation capabilities, yet their ability to generate executable visualization code from plots, especially for complex 3D, animated, plot-to-plot…

Human-Computer Interaction · Computer Science · 2026-01-21 · Yi Zhao, Zhen Yang, Shuaiqi Duan, Wenmeng Yu, Zhe Su, Jibing Gong, Jie Tang

Code has emerged as a precise and executable medium for reasoning and action in the agent era. Yet, progress has largely focused on language-centric tasks such as program synthesis and debugging, leaving visual-centric coding underexplored.…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-05 · Kevin Qinghong Lin, Yuhao Zheng, Hangyu Ran, Dantong Zhu, Dongxing Mao, Linjie Li, Philip Torr, Alex Jinpeng Wang

Text rendering has recently emerged as one of the most challenging frontiers in visual generation, drawing significant attention from large-scale diffusion and multimodal models. However, text editing within images remains largely…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-19 · Rui Gui, Yang Wan, Haochen Han, Dongxing Mao, Fangming Liu, Min Li, Alex Jinpeng Wang

To adequately test modern code generation systems, evaluation benchmarks must execute and test the code generated by the system. However, these execution and testing requirements have largely limited benchmarks to settings where code is…

Software Engineering · Computer Science · 2024-10-04 · Yiqing Xie, Alex Xie, Divyanshu Sheth, Pengfei Liu, Daniel Fried, Carolyn Rose

Image-to-code generation tests whether a vision-language model (VLM) can recover the structure of an image enough to express it as executable code. Existing benchmarks either focus on narrow visual domains, depend on paired executable…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-13 · Ajay Vikram Periasami, Junlin Wang, Bhuwan Dhingra

Web applications (web apps) have become a key arena for large language models (LLMs) to demonstrate their code generation capabilities and commercial potential. However, building a benchmark for LLM-generated web apps remains challenging…

Software Engineering · Computer Science · 2026-03-17 · Chenxu Liu, Yingjie Fu, Wei Yang, Ying Zhang, Tao Xie

3D scene understanding has been transformed by open-vocabulary language models that enable interaction via natural language. However, at present the evaluation of these representations is limited to datasets with closed-set semantics that…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-15 · Christina Kassab, Sacha Morin, Martin Büchner, Matías Mattamala, Kumaraditya Gupta, Abhinav Valada, Liam Paull, Maurice Fallon

Geometric spatial reasoning forms the foundation of many applications in artificial intelligence, yet the ability of large language models (LLMs) to operate over geometric spatial information expressed in procedural code remains…

Artificial Intelligence · Computer Science · 2026-02-11 · Shixian Luo, Zezhou Zhu, Yu Yuan, Yuncheng Yang, Lianlei Shan, Yong Wu

With the growing demand for spatiotemporal data processing and geospatial modeling, automating geospatial code generation has become essential for productivity. Large language models (LLMs) show promise in code generation but face…

Software Engineering · Computer Science · 2024-10-21 · Shuyang Hou, Zhangxiao Shen, Jianyuan Liang, Anqi Zhao, Zhipeng Gui, Rui Li, Huayi Wu

The task of crafting procedural programs capable of generating structurally valid 3D shapes easily and intuitively remains an elusive goal in computer vision and graphics. Within the graphics community, generating procedural 3D models has…

Graphics · Computer Science · 2025-03-21 · Ofek Pearl, Itai Lang, Yuhua Hu, Raymond A. Yeh, Rana Hanocka

Multimodal Large Language Models (MLLMs) struggle with precise reasoning for structured visuals like charts and diagrams, as pixel-based perception lacks a mechanism for verification. To address this, we propose to leverage derendering --…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-11 · Junhong Shen, Mu Cai, Bo Hu, Ameet Talwalkar, David A Ross, Cordelia Schmid, Alireza Fathi

Code generation has emerged as one of AI's highest-impact use cases, yet existing benchmarks measure isolated tasks rather than the complete "zero-to-one" process of building a working application from scratch. We introduce Vibe Code Bench,…

Software Engineering · Computer Science · 2026-05-15 · Hung Tran, Langston Nashold, Rayan Krishnan, Antoine Bigeard, Alex Gu

We introduce GeoBuildBench, a benchmark designed to evaluate whether large language models and multimodal agents can ground informal natural-language plane geometry problems into executable geometric constructions. Unlike existing geometry…

Computation and Language · Computer Science · 2026-05-14 · Jinwoong Kim, Rui Yang, Huishuai Zhang

Multimodal large language models (MLLMs) have significantly advanced the integration of visual and textual understanding. However, their ability to generate code from multimodal inputs remains limited. In this work, we introduce VisCodex, a…

Computation and Language · Computer Science · 2025-08-14 · Lingjie Jiang, Shaohan Huang, Xun Wu, Yixia Li, Dongdong Zhang, Furu Wei

Volumetric design is the first and critical step for professional building design, where architects not only depict the rough 3D geometry of the building but also specify the programs to form a 2D layout on each floor. Though 2D layout…

Machine Learning · Computer Science · 2021-04-28 · Kai-Hung Chang, Chin-Yi Cheng, Jieliang Luo, Shingo Murata, Mehdi Nourbakhsh, Yoshito Tsuji

Design-to-code translates high-fidelity UI designs into executable front-end implementations, but progress remains hard to compare due to inconsistent datasets, toolchains, and evaluation protocols. We introduce 1D-Bench, a benchmark…

Software Engineering · Computer Science · 2026-02-24 · Qiao Xu, Yipeng Yu, Chengxiao Feng, Xu Liu