Related papers: Testing Framework for Black-box AI Models

200 papers

The last decade has seen tremendous progress in AI technology and applications. With such widespread adoption, ensuring the reliability of the AI models is crucial. In the past, we took the first step of creating a testing framework called…

Artificial Intelligence · Computer Science 2021-10-08 Swagatam Haldar, Deepak Vijaykeerthy, Diptikalyan Saha

This article proposes a test procedure that can be used to test ML models and ML-based systems independently of the actual training process. In this way, the typical quality statements such as accuracy and precision of these models and…

Machine Learning · Computer Science 2024-06-21 Hans-Werner Wiesbrock, Jürgen Großmann

As AI systems advance and integrate into society, well-designed and transparent evaluations are becoming essential tools in AI governance, informing decisions by providing evidence about system capabilities and risks. Yet there remains a…

The quality and correct functioning of software components embedded in electronic systems are of utmost concern especially for safety and mission-critical systems. Model-based testing and formal verification techniques can be employed to…

Formal Languages and Automata Theory · Computer Science 2019-01-08 Shahbaz Ali, Hailong Sun, Yongwang Zhao

Generative AI (GenAI) models have become vital across industries, yet current evaluation methods have not adapted to their widespread use. Traditional evaluations often rely on benchmarks and fixed datasets, frequently failing to reflect…

AI systems, in particular with deep learning techniques, have demonstrated superior performance for various real-world applications. Given the need for tailored optimization in specific scenarios, as well as the concerns related to the…

Artificial Intelligence · Computer Science 2024-11-12 Zhiyu Zhu, Zhibo Jin, Hongsheng Hu, Minhui Xue, Ruoxi Sun, Seyit Camtepe, Praveen Gauravaram, Huaming Chen

In black-box testing of GUI applications (a form of system testing), a dynamic analysis of the GUI application is used to infer a black-box model; the black-box model is then used to derive test cases for the test of the GUI application. In…

Software Engineering · Computer Science 2012-10-18 Stephan Arlt, Evren Ermis, Sergio Feo-Arenis, Andreas Podelski

Model checking is an established technique to formally verify automation systems which are required to be trusted. However, for sufficiently complex systems model checking becomes computationally infeasible. On the other hand, testing,…

Software Engineering · Computer Science 2019-07-30 Igor Buzhinsky, Valeriy Vyatkin

Artificial intelligence (AI) holds great promise for supporting clinical trials, from patient recruitment and endpoint assessment to treatment response prediction. However, deploying AI without safeguards poses significant risks,…

Machine Learning · Computer Science 2025-10-09 Yao Chen, David Ohlssen, Aimee Readie, Gregory Ligozio, Ruvie Martin, Thibaud Coroller

Modern AI systems increasingly comprise multiple interconnected neural networks to tackle complex inference tasks. Testing such systems for robustness and safety entails significant challenges. Current state-of-the-art robustness testing…

Artificial Intelligence · Computer Science 2026-01-28 Sayak Chowdhury, Meenakshi D'Souza

Artificial intelligence develops techniques and systems whose performance must be evaluated on a regular basis in order to certify and foster progress in the discipline. We will describe and critically assess the different ways AI systems…

Artificial Intelligence · Computer Science 2016-08-23 Jose Hernandez-Orallo

Software testing remains critical for ensuring reliability, yet traditional approaches are slow, costly, and prone to gaps in coverage. This paper presents an AI-driven framework that automates test case generation and validation using…

Software Engineering · Computer Science 2025-08-25 Saba Naqvi, Mohammad Baqar

While the capabilities and utility of AI systems have advanced, rigorous norms for evaluating these systems have lagged. Grand claims, such as models achieving general reasoning capabilities, are supported with model performance on narrow…

Test and evaluation is a necessary process for ensuring that engineered systems perform as intended under a variety of conditions, both expected and unexpected. In this work, we consider the unique challenges of developing a unifying test…

Systems and Control · Electrical Eng. & Systems 2022-01-21 Erin Lanus, Ivan Hernandez, Adam Dachowicz, Laura Freeman, Melanie Grande, Andrew Lang, Jitesh H. Panchal, Anthony Patrick, Scott Welch

We present a framework that makes it possible to certify the fairness degree of a model based on an interactive and privacy-preserving test. The framework verifies any trained model, regardless of its training process and architecture. Thus, it allows…

Artificial Intelligence · Computer Science 2021-06-28 Shahar Segal, Yossi Adi, Benny Pinkas, Carsten Baum, Chaya Ganesh, Joseph Keshet

Edge artificial intelligence (AI) will be a central part of 6G, with powerful edge servers supporting devices in performing machine learning (ML) inference. However, it is challenging to deliver the latency and accuracy guarantees required…

Information Theory · Computer Science 2025-06-16 Anders E. Kalør, Tomoaki Ohtsuki

We can never be certain that a software system is correct simply by testing it, but with every additional successful test we become less uncertain about its correctness. In the absence of source code or elaborate specifications and models,…

Software Engineering · Computer Science 2016-08-11 Neil Walkinshaw, Gordon Fraser

As AI models scale to billions of parameters and operate with increasing autonomy, ensuring their safe, reliable operation demands engineering-grade security and assurance frameworks. This paper presents an enterprise-level, risk-aware,…

Cryptography and Security · Computer Science 2025-05-13 Krti Tallam

The increasing usage of machine learning models raises the question of the reliability of these models. The current practice of testing with limited data is often insufficient. In this paper, we provide a framework for automated test data…

Machine Learning · Computer Science 2021-11-04 Diptikalyan Saha, Aniya Aggarwal, Sandeep Hans

AI assistants can increasingly generate and evolve test cases. The challenge is no longer merely to produce them, but also to help engineers understand why a generated artefact exists and what supports it. Existing work has focused on…

Software Engineering · Computer Science 2026-04-27 Eduard Paul Enoiu, Robert Feldt