Related papers: Data Synthesis for Testing Black-Box Machine Learn…
This article proposes a test procedure that can be used to test ML models and ML-based systems independently of the actual training process. In this way, the typical quality statements such as accuracy and precision of these models and…
Today, machine learning (ML) models are increasingly applied in decision making. This induces an urgent need for quality assurance of ML models with respect to (often domain-dependent) requirements. Monotonicity is one such requirement. It…
Auditing the use of data in training machine-learning (ML) models is an increasingly pressing challenge, as myriad ML practitioners routinely leverage the effort of content creators to train models without their permission. In this paper,…
Large Language Models (LLM) are increasingly trained on data generated by other LLM, either because generated text and images become part of the pre-training corpus, or because synthetized data is used as a replacement for expensive…
Ensuring the generalisability of clinical machine learning (ML) models across diverse healthcare settings remains a significant challenge due to variability in patient demographics, disease prevalence, and institutional practices. Existing…
With widespread adoption of AI models for important decision making, ensuring reliability of such models remains an important challenge. In this paper, we present an end-to-end generic framework for testing AI Models which performs…
Recently, machine and deep learning (ML/DL) algorithms have been increasingly adopted in many software systems. Due to their inductive nature, ensuring the quality of these systems remains a significant challenge for the research community.…
Training large language models (LLMs) for external tool usage is a rapidly expanding field, with recent research focusing on generating synthetic data to address the shortage of available data. However, the absence of systematic data…
Accurately evaluating model performance is crucial for deploying machine learning systems in real-world applications. Traditional methods often require a sufficiently large labeled test set to ensure a reliable evaluation. However, in many…
Component-based software development has posed a serious challenge to system verification since externally-obtained components could be a new source of system failures. This issue can not be completely solved by either model-checking or…
Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program…
Test collections are crucial for evaluating Information Retrieval (IR) systems. Creating a diverse set of user queries for these collections can be challenging, and obtaining relevance judgments, which indicate how well retrieved documents…
Novel deep-learning (DL) architectures have reached a level where they can generate digital media, including photorealistic images, that are difficult to distinguish from real data. These technologies have already been used to generate…
The quality and correct functioning of software components embedded in electronic systems are of utmost concern especially for safety and mission-critical systems. Model-based testing and formal verification techniques can be employed to…
The advancement of Artificial Intelligence (AI) has created opportunities for e-learning, particularly in automated assessment systems that reduce educators' workload and provide timely feedback to students. However, developing effective…
When developing text classification models for real world applications, one major challenge is the difficulty to collect sufficient data for all text classes. In this work, we address this challenge by utilizing large language models (LLMs)…
Assessing and improving the quality of data in data-intensive systems are fundamental challenges that have given rise to numerous applications targeting transformation and cleaning of data. However, while schema design, data cleaning, and…
The integration of machine learning (ML) models enhances the efficiency, affordability, and reliability of feature detection in microscopy, yet their development and applicability are hindered by the dependency on scarce and often flawed…
Despite the tremendous advances that have been made in the last decade on developing useful machine-learning applications, their wider adoption has been hindered by the lack of strong assurance guarantees that can be made about their…
Testing in production-like test environments is an essential part of quality assurance processes in many industries. Provisioning of such test environments, for information-intensive services, involves setting up databases that are…