Related papers: ViUniT: Visual Unit Tests for More Robust Visual P…

VDebugger: Harnessing Execution Feedback for Debugging Visual Programs

Visual programs are executable code generated by large language models to address visual reasoning problems. They decompose complex questions into multiple reasoning steps and invoke specialized models for each step to solve the problems.…

Computation and Language · Computer Science 2024-10-07 Xueqing Wu, Zongyu Lin, Songyan Zhao, Te-Lin Wu, Pan Lu, Nanyun Peng, Kai-Wei Chang

VisionUnite: A Vision-Language Foundation Model for Ophthalmology Enhanced with Clinical Knowledge

The need for improved diagnostic methods in ophthalmology is acute, especially in the underdeveloped regions with limited access to specialists and advanced equipment. Therefore, we introduce VisionUnite, a novel vision-language foundation…

Image and Video Processing · Electrical Eng. & Systems 2025-08-13 Zihan Li, Diping Song, Zefeng Yang, Deming Wang, Fei Li, Xiulan Zhang, Paul E. Kinahan, Yu Qiao

Probing Visual Language Priors in VLMs

Despite recent advances in Vision-Language Models (VLMs), they may over-rely on visual language priors existing in their training data rather than true visual reasoning. To investigate this, we introduce ViLP, a benchmark featuring…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Tiange Luo, Ang Cao, Gunhee Lee, Justin Johnson, Honglak Lee

A Metamodel of Unit Testing for Object-Oriented Programming Languages

A unit test is a method for verifying the accuracy and the proper functioning of a portion of a program. This work studies the relations and approaches for testing Object-Oriented Programming (OOP) programs and proposes a…

Programming Languages · Computer Science 2009-12-21 Martin Levesque

ViperGPT: Visual Inference via Python Execution for Reasoning

Answering visual queries is a complex task that requires both visual processing and reasoning. End-to-end models, the dominant approach for this task, do not explicitly differentiate between the two, limiting interpretability and…

Computer Vision and Pattern Recognition · Computer Science 2023-03-15 Dídac Surís, Sachit Menon, Carl Vondrick

Beyond Accuracy: Evaluating Grounded Visual Evidence in Thinking with Images

Despite the remarkable progress of Vision-Language Models (VLMs) in adopting "Thinking-with-Images" capabilities, accurately evaluating the authenticity of their reasoning process remains a critical challenge. Existing benchmarks mainly…

Computer Vision and Pattern Recognition · Computer Science 2026-01-21 Xuchen Li, Xuzhao Li, Renjie Pi, Shiyu Hu, Jian Zhao, Jiahui Gao

Leveraging GPT-4 for Vulnerability-Witnessing Unit Test Generation

In the life-cycle of software development, testing plays a crucial role in quality assurance. Proper testing not only increases code coverage and prevents regressions but it can also ensure that any potential vulnerabilities in the software…

Software Engineering · Computer Science 2025-06-16 Gábor Antal, Dénes Bán, Martin Isztin, Rudolf Ferenc, Péter Hegedűs

Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests

Automated unit test generators, particularly search-based software testing tools like EvoSuite, are capable of generating tests with high coverage. Although these generators alleviate the burden of writing unit tests, they often pose…

Software Engineering · Computer Science 2024-08-22 Amirhossein Deljouyi, Roham Koohestani, Maliheh Izadi, Andy Zaidman

PropTest: Automatic Property Testing for Improved Visual Programming

Visual Programming has recently emerged as an alternative to end-to-end black-box visual reasoning models. This type of method leverages Large Language Models (LLMs) to generate the source code for an executable computer program that solves…

Computer Vision and Pattern Recognition · Computer Science 2024-07-24 Jaywon Koo, Ziyan Yang, Paola Cascante-Bonilla, Baishakhi Ray, Vicente Ordonez

Visual GUI testing in practice: An extended industrial case study

Context: Visual GUI testing (VGT) is referred to as the latest generation GUI-based testing. It is a tool-driven technique, which uses image recognition for interacting with and asserting the behavior of the system under test. Motivated by…

Software Engineering · Computer Science 2020-05-21 Vahid Garousi, Wasif Afzal, Adem Çağlar, İhsan Berk Işık, Berker Baydan, Seçkin Çaylak, Ahmet Zeki Boyraz, Burak Yolaçan, Kadir Herkiloğlu

AI-Assisted Unit Test Writing and Test-Driven Code Refactoring: A Case Study

Many software systems originate as prototypes or minimum viable products (MVPs), developed with an emphasis on delivery speed and responsiveness to changing requirements rather than long-term code maintainability. While effective for rapid…

Software Engineering · Computer Science 2026-04-06 Ema Smolic, Mario Brcic, Luka Hobor, Mihael Kovac

Revisiting Visual Question Answering Baselines

Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms…

Computer Vision and Pattern Recognition · Computer Science 2016-11-24 Allan Jabri, Armand Joulin, Laurens van der Maaten

UNIT: Unifying Image and Text Recognition in One Vision Encoder

Currently, vision encoder models like Vision Transformers (ViTs) typically excel at image recognition tasks but cannot simultaneously support text recognition like human visual recognition. To address this limitation, we propose UNIT, a…

Computer Vision and Pattern Recognition · Computer Science 2024-09-09 Yi Zhu, Yanpeng Zhou, Chunwei Wang, Yang Cao, Jianhua Han, Lu Hou, Hang Xu

VeriContest: A Competitive-Programming Benchmark for Verifiable Code Generation

Large language models can generate useful code from natural language, but their outputs come without correctness guarantees. Verifiable code generation offers a path beyond testing by requiring models to produce not only executable code,…

Software Engineering · Computer Science 2026-05-12 Zichen Xie, Mrigank Pawagi, Yuxin Liu, Aaditi Rai, Lize Shao, John Berberian, Sicong Che, Wenxi Wang

UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling

Significant research efforts have been made to scale and improve vision-language model (VLM) training approaches. Yet, with an ever-growing number of benchmarks, researchers are tasked with the heavy burden of implementing each protocol,…

Computer Vision and Pattern Recognition · Computer Science 2024-08-12 Haider Al-Tahan, Quentin Garrido, Randall Balestriero, Diane Bouchacourt, Caner Hazirbas, Mark Ibrahim

TestART: Improving LLM-based Unit Testing via Co-evolution of Automated Generation and Repair Iteration

Unit testing is crucial for detecting bugs in individual program units but consumes time and effort. Recently, large language models (LLMs) have demonstrated remarkable capabilities in generating unit test cases. However, several problems…

Software Engineering · Computer Science 2025-04-01 Siqi Gu, Quanjun Zhang, Kecheng Li, Chunrong Fang, Fangyuan Tian, Liuchuan Zhu, Jianyi Zhou, Zhenyu Chen

V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced. This is…

Computer Vision and Pattern Recognition · Computer Science 2019-07-30 Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Vision-Language Models have made significant progress on many perception-focused tasks. However, their progress on reasoning-focused tasks remains limited due to the lack of high-quality and diverse training data. In this work, we aim to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Yiming Jia, Jiachen Li, Xiang Yue, Bo Li, Ping Nie, Kai Zou, Wenhu Chen

UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes

We introduce UViM, a unified approach capable of modeling a wide range of computer vision tasks. In contrast to previous models, UViM has the same functional form for all tasks; it requires no task-specific modifications which require…

Computer Vision and Pattern Recognition · Computer Science 2022-10-17 Alexander Kolesnikov, André Susano Pinto, Lucas Beyer, Xiaohua Zhai, Jeremiah Harmsen, Neil Houlsby

RUBi: Reducing Unimodal Biases in Visual Question Answering

Visual Question Answering (VQA) is the task of answering questions about an image. Some VQA models often exploit unimodal biases to provide the correct answer without using the image information. As a result, they suffer from a huge drop in…

Computer Vision and Pattern Recognition · Computer Science 2020-03-24 Remi Cadene, Corentin Dancette, Hedi Ben-younes, Matthieu Cord, Devi Parikh