Related papers: Efficiently Computing Compact Formal Explanations

VeriX: Towards Verified Explainability of Deep Neural Networks

We present VeriX (Verified eXplainability), a system for producing optimal robust explanations and generating counterfactuals along decision boundaries of machine learning models. We build such explanations and counterfactuals iteratively…

Machine Learning · Computer Science 2023-09-27 Min Wu, Haoze Wu, Clark Barrett

Towards Verified and Targeted Explanations through Formal Methods

As deep neural networks are deployed in safety-critical domains such as autonomous driving and medical diagnosis, stakeholders need explanations that are interpretable but also trustworthy with formal guarantees. Existing XAI methods fall…

Machine Learning · Computer Science 2026-04-17 Hanchen David Wang, Diego Manzanas Lopez, Preston K. Robinette, Ipek Oguz, Taylor T. Johnson, Meiyi Ma

Faster Verified Explanations for Neural Networks

Verified explanations are a principled way to explain the decisions taken by neural networks, which are otherwise black-box in nature. However, these techniques face significant scalability challenges, as they require multiple calls to…

Machine Learning · Computer Science 2026-05-11 Alessandro De Palma, Greta Dolcetti, Caterina Urban

Fast Explanations via Policy Gradient-Optimized Explainer

The challenge of delivering efficient explanations is a critical barrier that prevents the adoption of model explanations in real-world applications. Existing approaches often depend on extensive model queries for sample-level explanations…

Machine Learning · Computer Science 2026-03-10 Deng Pan, Nuno Moniz, Nitesh Chawla

Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training

Feature attribution methods highlight the important input tokens as explanations to model predictions, which have been widely applied to deep neural networks towards trustworthy AI. However, recent works show that explanations provided by…

Computation and Language · Computer Science 2024-01-01 Dongfang Li, Baotian Hu, Qingcai Chen, Shan He

Compact Proofs of Model Performance via Mechanistic Interpretability

We propose using mechanistic interpretability -- techniques for reverse engineering model weights into human-interpretable algorithms -- to derive and compactly prove formal guarantees on model performance. We prototype this approach by…

Machine Learning · Computer Science 2024-12-25 Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, Lawrence Chan

Probabilistic Verification of Fairness Properties via Concentration

As machine learning systems are increasingly used to make real world legal and financial decisions, it is of paramount importance that we develop algorithms to verify that these systems do not discriminate against minorities. We design a…

Artificial Intelligence · Computer Science 2020-01-01 Osbert Bastani, Xin Zhang, Armando Solar-Lezama

Causal Explanations for Image Classifiers

Existing algorithms for explaining the output of image classifiers use different definitions of explanations and a variety of techniques to find them. However, none of the existing tools use a principled approach based on formal definitions…

Artificial Intelligence · Computer Science 2026-02-23 Hana Chockler, David A. Kelly, Daniel Kroening, Youcheng Sun

Explaining, Fast and Slow: Abstraction and Refinement of Provable Explanations

Despite significant advancements in post-hoc explainability techniques for neural networks, many current methods rely on heuristics and do not provide formally provable guarantees over the explanations provided. Recent work has shown that…

Machine Learning · Computer Science 2025-06-11 Shahaf Bassan, Yizhak Yisrael Elboher, Tobias Ladner, Matthias Althoff, Guy Katz

Bound Propagation meets Constraint Simplification: Improving Logic-based XAI for Neural Networks

Logic-based methods for explaining neural network decisions offer formal guarantees of correctness and non-redundancy, but they often suffer from high computational costs, especially for large networks. In this work, we improve the…

Logic in Computer Science · Computer Science 2026-03-03 Ronaldo Gomes, Jairo Ribeiro, Luiz Queiroz, Thiago Alves Rocha

Plex: Towards Reliability using Pretrained Large Model Extensions

A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore…

Machine Learning · Computer Science 2022-07-18 Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan

Compact Optimality Verification for Optimization Proxies

Recent years have witnessed increasing interest in optimization proxies, i.e., machine learning models that approximate the input-output mapping of parametric optimization problems and return near-optimal feasible solutions. Following…

Optimization and Control · Mathematics 2024-06-03 Wenbo Chen, Haoruo Zhao, Mathieu Tanneau, Pascal Van Hentenryck

ViTmiX: Vision Transformer Explainability Augmented by Mixed Visualization Methods

Recent advancements in Vision Transformers (ViT) have demonstrated exceptional results in various visual recognition tasks, owing to their ability to capture long-range dependencies in images through self-attention mechanisms. However, the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-20 Eduard Hogea, Darian M. Onchis, Ana Coporan, Adina Magda Florea, Codruta Istin

ExClaim: Explainable Neural Claim Verification Using Rationalization

With the advent of deep learning, text generation language models have improved dramatically, with text at a similar level as human-written text. This can lead to rampant misinformation because content can now be created cheaply and…

Computation and Language · Computer Science 2023-01-24 Sai Gurrapu, Lifu Huang, Feras A. Batarseh

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

With the release of OpenAI's o1 model, reasoning models that adopt slow-thinking strategies have become increasingly common. Their outputs often contain complex reasoning, intermediate steps, and self-reflection, making existing evaluation…

Computation and Language · Computer Science 2026-01-01 Ding Chen, Qingchen Yu, Pengyuan Wang, Mengting Hu, Wentao Zhang, Zhengren Wang, Bo Tang, Feiyu Xiong, Xinchi Li, Chao Wang, Minchuan Yang, Zhiyu Li

Towards Formal XAI: Formally Approximate Minimal Explanations of Neural Networks

With the rapid growth of machine learning, deep neural networks (DNNs) are now being used in numerous domains. Unfortunately, DNNs are "black-boxes", and cannot be interpreted by humans, which is a substantial concern in safety-critical…

Machine Learning · Computer Science 2023-02-10 Shahaf Bassan, Guy Katz

REX: Reasoning-aware and Grounded Explanation

Effectiveness and interpretability are two essential properties for trustworthy AI systems. Most recent studies in visual reasoning are dedicated to improving the accuracy of predicted answers, and less attention is paid to explaining the…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Shi Chen, Qi Zhao

Learning Quantifiable Visual Explanations Without Ground-Truth

Explainable AI (XAI) techniques are increasingly important for the validation and responsible use of modern deep learning models, but are difficult to evaluate due to the lack of good ground-truth to compare against. We propose a framework…

Artificial Intelligence · Computer Science 2026-05-19 Amritpal Singh, Andrey Barsky, Mohamed Ali Souibgui, Ernest Valveny, Dimosthenis Karatzas

From Robustness to Explainability and Back Again

Formal explainability guarantees the rigor of computed explanations, and so it is paramount in domains where rigor is critical, including those deemed high-risk. Unfortunately, since its inception formal explainability has been hampered by…

Artificial Intelligence · Computer Science 2024-12-04 Xuanxiang Huang, Joao Marques-Silva

Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization

Certified verification of transformer attention requires bounding the softmax function over interval constraints on the pre-softmax scores. Existing verifiers relax softmax independently of the downstream objective, leaving avoidable slack.…

Machine Learning · Computer Science 2026-05-13 Navid Rezazadeh, Arash Gholami Davoodi