Related papers: VDebugger: Harnessing Execution Feedback for Debug…

The Visual Debugger Tool

Debugging is an essential part of software maintenance and evolution since it allows software developers to analyze program execution step by step. Understanding a program is required to fix potential flaws, alleviate bottlenecks, and…

Software Engineering · Computer Science 2024-04-22 Tim Kräuter, Harald König, Adrian Rutle, Yngve Lamo

De-fine: Decomposing and Refining Visual Programs with Auto-Feedback

Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks. Unlike end-to-end models that need task-specific data, it advances in performing visual…

Computer Vision and Pattern Recognition · Computer Science 2024-08-06 Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Zheqi Lv, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

ViUniT: Visual Unit Tests for More Robust Visual Programming

Programming based approaches to reasoning tasks have substantially expanded the types of questions models can answer about visual scenes. Yet on benchmark visual reasoning data, when models answer correctly, they produce incorrect programs…

Computer Vision and Pattern Recognition · Computer Science 2024-12-13 Artemis Panagopoulou, Honglu Zhou, Silvio Savarese, Caiming Xiong, Chris Callison-Burch, Mark Yatskar, Juan Carlos Niebles

VisualCoder: Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning

Predicting program behavior and reasoning about code execution remain significant challenges in software engineering, particularly for large language models (LLMs) designed for code analysis. While these models excel at understanding static…

Software Engineering · Computer Science 2025-02-11 Cuong Chi Le, Hoang-Chau Truong-Vinh, Huy Nhat Phan, Dung Duy Le, Tien N. Nguyen, Nghi D. Q. Bui

Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step

Large language models (LLMs) are leading significant progress in code generation. Beyond one-pass code generation, recent works further integrate unit tests and program verifiers into LLMs to iteratively refine the generated programs.…

Software Engineering · Computer Science 2024-06-12 Li Zhong, Zilong Wang, Jingbo Shang

Agent That Debugs: Dynamic State-Guided Vulnerability Repair

In recent years, more vulnerabilities have been discovered every day, while manual vulnerability repair requires specialized knowledge and is time-consuming. As a result, many detected or even published vulnerabilities remain unpatched,…

Software Engineering · Computer Science 2025-04-11 Zhengyao Liu, Yunlong Ma, Jingxuan Xu, Junchen Ai, Xiang Gao, Hailong Sun, Abhik Roychoudhury

Coding with Eyes: Visual Feedback Unlocks Reliable GUI Code Generating and Debugging

Recent advances in Large Language Model (LLM)-based agents have shown remarkable progress in code generation. However, current agent methods mainly rely on text-output-based feedback (e.g. command-line outputs) for multi-round debugging and…

Software Engineering · Computer Science 2026-04-23 Zhilin Liu, Ye Huang, Ting Xie, Ruizhi Zhang, Wen Li, Lixin Duan

An Insight View of Kernel Visual Debugger in System Boot up

For many years, developers could not figure out the mystery of OS kernels. The main source of this mystery is the interaction between operating systems and hardware during system boot-up and kernel initialization. In addition, many…

Operating Systems · Computer Science 2012-11-21 Mohamed Farag

Towards a Neural Debugger for Python

Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team…

Machine Learning · Computer Science 2026-03-11 Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel Synnaeve

An Introduction to Deep Visual Explanation

The practical impact of deep learning on complex supervised learning problems has been significant, so much so that almost every Artificial Intelligence problem, or at least a portion thereof, has been somehow recast as a deep learning…

Machine Learning · Statistics 2018-03-19 Housam Khalifa Bashier Babiker, Randy Goebel

Visualizing the Evaluation of Functional Programs for Debugging

In this position paper, we present a prototype of a visualizer for functional programs. Such programs, whose evaluation model is the reduction of an expression to a value through repeated application of rewriting rules, and which tend to…

Programming Languages · Computer Science 2024-11-04 John Whitington, Tom Ridge

Inferring and Executing Programs for Visual Reasoning

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases…

Computer Vision and Pattern Recognition · Computer Science 2017-05-11 Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick

ViperGPT: Visual Inference via Python Execution for Reasoning

Answering visual queries is a complex task that requires both visual processing and reasoning. End-to-end models, the dominant approach for this task, do not explicitly differentiate between the two, limiting interpretability and…

Computer Vision and Pattern Recognition · Computer Science 2023-03-15 Dídac Surís, Sachit Menon, Carl Vondrick

ViScratch: Using Large Language Models and Gameplay Videos for Automated Feedback in Scratch

Block-based programming environments such as Scratch are increasingly popular in programming education, in particular for young learners. While the use of blocks helps prevent syntax errors, semantic bugs remain common and difficult to…

Software Engineering · Computer Science 2025-09-16 Yuan Si, Daming Li, Hanyuan Shi, Jialu Zhang

TraceDiff: Debugging Unexpected Code Behavior Using Trace Divergences

Recent advances in program synthesis offer means to automatically debug student submissions and generate personalized feedback in massive programming classrooms. When automatically generating feedback for programming assignments, a key…

Human-Computer Interaction · Computer Science 2017-08-15 Ryo Suzuki, Gustavo Soares, Andrew Head, Elena Glassman, Ruan Reis, Melina Mongiovi, Loris D'Antoni, Bjoern Hartmann

RECODE: Reasoning Through Code Generation for Visual Question Answering

Multimodal Large Language Models (MLLMs) struggle with precise reasoning for structured visuals like charts and diagrams, as pixel-based perception lacks a mechanism for verification. To address this, we propose to leverage derendering --…

Computer Vision and Pattern Recognition · Computer Science 2026-03-11 Junhong Shen, Mu Cai, Bo Hu, Ameet Talwalkar, David A Ross, Cordelia Schmid, Alireza Fathi

Debugging Defective Visualizations: Empirical Insights Informing a Human-AI Co-Debugging System

Visualization authoring is an iterative process requiring users to adjust parameters to achieve desired aesthetics. Due to its complexity, users often create defective visualizations and struggle to fix them. Many seek help on forums (e.g.,…

Human-Computer Interaction · Computer Science 2026-02-05 Shuyu Shen, Sirong Lu, Leixian Shen, Yuyu Luo

iNNspector: Visual, Interactive Deep Model Debugging

Deep learning model design, development, and debugging is a process driven by best practices, guidelines, trial-and-error, and the personal experiences of model developers. At multiple stages of this process, performance and internal model…

Human-Computer Interaction · Computer Science 2024-07-26 Thilo Spinner, Daniel Fürst, Mennatallah El-Assady

FVDebug: An LLM-Driven Debugging Assistant for Automated Root Cause Analysis of Formal Verification Failures

Debugging formal verification (FV) failures represents one of the most time-consuming bottlenecks in modern hardware design workflows. When properties fail, engineers must manually trace through complex counter-examples spanning multiple…

Hardware Architecture · Computer Science 2025-10-21 Yunsheng Bai, Ghaith Bany Hamad, Chia-Tung Ho, Syed Suhaib, Haoxing Ren

CUDABeaver: Benchmarking LLM-Based Automated CUDA Debugging

Debugging CUDA programs has long been challenging because failures often arise from subtle interactions among hardware behavior, compiler decisions, memory hierarchy, and asynchronous execution. More importantly, with the rapid expansion of…

Machine Learning · Computer Science 2026-05-27 Shiyang Li, Haoyang Chen, Mattia Fazzini, Caiwen Ding