Related papers: Improving Counterexample Quality from Failed Progr…
Test or prove? These two approaches to software verification have long been presented as opposites. One is dynamic, the other static: a test executes the program, a proof only analyzes the program text. A different perspective is emerging,…
A successful automated program proof is, in software verification, the ultimate triumph. In practice, however, the road to such success is paved with many failed proof attempts. Unlike a failed test, which provides concrete evidence of an…
Counterfactual Explanations (CFEs) interpret machine learning models by identifying the smallest change to input features needed to change the model's prediction to a desired output. For classification tasks, CFEs determine how close a…
Applying deductive verification to formally prove that a program respects its formal specification is a very complex and time-consuming task due in particular to the lack of feedback in case of proof failures. Along with a non-compliance…
Software verification is a tedious process that involves the analysis of multiple failed verification attempts, and adjustments of the program or specification. This is especially the case for complex requirements, e.g., regarding security…
Large Language Models (LLMs) have shown promising results in automating formal verification. However, existing approaches treat proof generation as a static, end-to-end prediction over source code, relying on limited verifier feedback and…
Providing compact and understandable counterexamples for violated system properties is an essential task in model checking. Existing works on counterexamples for probabilistic systems so far computed either a large set of system runs or a…
Transformer models often exhibit brittle extrapolation, failing on inputs that are longer or structurally more complex than those seen during training. We introduce Counter-Example-Driven Curricula (CEDC), an automated framework that…
In model checking, when a given model fails to satisfy the desired specification, a typical model checker provides a counterexample that illustrates how the violation occurs. In general, there exist many diverse counterexamples that exhibit…
The proofs first generated by automated theorem provers are far from optimal by any measure of simplicity. In this paper I describe a technique for simplifying automated proofs. Hopefully this discussion will stimulate interest in the…
AI agents have shown initial promise in automating mathematical theorem proving in proof assistants such as Lean. The same proof assistants can be used to verify the correctness of code by pairing code with specifications and proofs that…
While language models are increasingly more proficient at code generation, they still frequently generate incorrect programs. Many of these programs are obviously wrong, but others are more subtle and pass weaker correctness checks such as…
Counterfactual explanations (CEs) offer a human-understandable way to explain decisions by identifying specific changes to the input parameters of a base or present model that would lead to a desired change in the outcome. For optimization…
Probabilistic programs are key to deal with uncertainty in e.g. controller synthesis. They are typically small but intricate. Their development is complex and error prone requiring quantitative reasoning over a myriad of alternative…
Identifying the cause of a proof failure during deductive verification of programs is hard: it may be due to an incorrectness in the program, an incompleteness in the program annotations, or an incompleteness of the prover. The changes…
In deductive verification and software model checking, dealing with certain specification language constructs can be problematic when the back-end solver is not sufficiently powerful or lacks the required theories. One way to deal with this…
In this paper we introduce a novel way to speed up the discovery of counterexamples in bounded model checking, based on parallel runs over versions of a system in which features have been randomly disabled. As shown in previous work, adding…
Verification and validation of cyber-physical systems (CPS) via large-scale simulation often surface failures that are hard to interpret, especially when triggered by interactions between continuous and discrete behaviors at specific events…
The main challenge in using abstractions effectively, is to construct a suitable abstraction for the system being verified. One approach that tries to address this problem is that of {\it counterexample guided abstraction-refinement…
Counterfactual explanations (CEs) enhance the interpretability of machine learning models by describing what changes to an input are necessary to change its prediction to a desired class. These explanations are commonly used to guide users'…