Related papers: Invariant Rationalization

Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

Extracting a small subset of crucial rationales from the full input is a key problem in explainability research. The most widely used fundamental criterion for rationale extraction is the maximum mutual information (MMI) criterion. In this…

Artificial Intelligence · Computer Science 2025-03-11 Wei Liu, Zhiying Deng, Zhongyu Niu, Jun Wang, Haozhao Wang, Zhigang Zeng, Ruixuan Li

Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization

An important line of research in the field of explainability is to extract a small subset of crucial rationales from the full input. The most widely used criterion for rationale extraction is the maximum mutual information (MMI) criterion.…

Machine Learning · Computer Science 2024-10-23 Wei Liu, Zhiying Deng, Zhongyu Niu, Jun Wang, Haozhao Wang, YuanKai Zhang, Ruixuan Li

Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control

Selective rationalization has become a common mechanism to ensure that predictive models reveal how they use any available features. The selection may be soft or hard, and identifies a subset of input features relevant for prediction. The…

Computation and Language · Computer Science 2019-12-17 Mo Yu, Shiyu Chang, Yang Zhang, Tommi S. Jaakkola

Aligning Explanations for Recommendation with Rating and Feature via Maximizing Mutual Information

Providing natural language-based explanations to justify recommendations helps to improve users' satisfaction and gain users' trust. However, as current explanation generation methods are commonly trained with an objective to mimic existing…

Information Retrieval · Computer Science 2024-08-22 Yurou Zhao, Yiding Sun, Ruidong Han, Fei Jiang, Lu Guan, Xiang Li, Wei Lin, Weizhi Ma, Jiaxin Mao

On Mutual Information Maximization for Representation Learning

Many recent methods for unsupervised or self-supervised representation learning train feature extractors by maximizing an estimate of the mutual information (MI) between different views of the data. This comes with several immediate…

Machine Learning · Computer Science 2020-01-24 Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic

Evaluating Explanations: An Explanatory Virtues Framework for Mechanistic Interpretability -- The Strange Science Part I.ii

Mechanistic Interpretability (MI) aims to understand neural networks through causal explanations. Though MI has many explanation-generating methods, progress has been limited by the lack of a universal approach to evaluating explanations.…

Machine Learning · Computer Science 2025-05-05 Kola Ayonrinde, Louis Jaburi

The Irrationality of Neural Rationale Models

Neural rationale models are popular for interpretable predictions of NLP tasks. In these, a selector extracts segments of the input text, called rationales, and passes these segments to a classifier for prediction. Since the rationale is…

Computation and Language · Computer Science 2022-07-26 Yiming Zheng, Serena Booth, Julie Shah, Yilun Zhou

Towards Trustworthy Explanation: On Causal Rationalization

With recent advances in natural language processing, rationalization becomes an essential self-explaining diagram to disentangle the black box by selecting a subset of input texts to account for the major variation in prediction. Yet,…

Machine Learning · Computer Science 2023-09-12 Wenbo Zhang, Tong Wu, Yunlong Wang, Yong Cai, Hengrui Cai

D-Separation for Causal Self-Explanation

Rationalization is a self-explaining framework for NLP models. Conventional work typically uses the maximum mutual information (MMI) criterion to find the rationale that is most indicative of the target label. However, this criterion can be…

Artificial Intelligence · Computer Science 2023-11-01 Wei Liu, Jun Wang, Haozhao Wang, Ruixuan Li, Zhiying Deng, YuanKai Zhang, Yang Qiu

Rationales for Sequential Predictions

Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain. We consider model explanations through rationales, subsets of context that can explain individual model predictions. We find…

Computation and Language · Computer Science 2021-11-19 Keyon Vafa, Yuntian Deng, David M. Blei, Alexander M. Rush

A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i

Mechanistic Interpretability aims to understand neural networks through causal explanations. We argue for the Explanatory View Hypothesis: that Mechanistic Interpretability research is a principled approach to understanding models because…

Machine Learning · Computer Science 2025-05-05 Kola Ayonrinde, Louis Jaburi

Unsupervised Selective Rationalization with Noise Injection

A major issue with using deep learning models in sensitive applications is that they provide no explanation for their output. To address this problem, unsupervised selective rationalization produces rationales alongside predictions by…

Computation and Language · Computer Science 2023-05-30 Adam Storek, Melanie Subbiah, Kathleen McKeown

Rationalization through Concepts

Automated predictions require explanations to be interpretable by humans. One type of explanation is a rationale, i.e., a selection of input features such as relevant text snippets from which the model computes the outcome. However, a…

Computation and Language · Computer Science 2021-05-12 Diego Antognini, Boi Faltings

Model Interpretability and Rationale Extraction by Input Mask Optimization

Concurrent to the rapid progress in the development of neural-network based models in areas like natural language processing and computer vision, the need for creating explanations for the predictions of these black-box models has risen…

Computation and Language · Computer Science 2025-08-18 Marc Brinner, Sina Zarriess

Flexibly-bounded Rationality and Marginalization of Irrationality Theories for Decision Making

In this paper the theory of flexibly-bounded rationality which is an extension to the theory of bounded rationality is revisited. Rational decision making involves using information which is almost always imperfect and incomplete together…

Artificial Intelligence · Computer Science 2013-06-11 Tshilidzi Marwala

Computational Rationalization: The Inverse Equilibrium Problem

Modeling the purposeful behavior of imperfect agents from a small number of observations is a challenging task. When restricted to the single-agent decision-theoretic setting, inverse optimal control techniques assume that observed behavior…

Computer Science and Game Theory · Computer Science 2013-08-19 Kevin Waugh, Brian D. Ziebart, J. Andrew Bagnell

MGR: Multi-generator Based Rationalization

Rationalization is to employ a generator and a predictor to construct a self-explaining NLP model in which the generator selects a subset of human-intelligible pieces of the input text to the following predictor. However, rationalization…

Machine Learning · Computer Science 2023-07-25 Wei Liu, Haozhao Wang, Jun Wang, Ruixuan Li, Xinyang Li, Yuankai Zhang, Yang Qiu

Distribution Matching for Rationalization

The task of rationalization aims to extract pieces of input text as rationales to justify neural network predictions on text classification tasks. By definition, rationales represent key text pieces used for prediction and thus should have…

Computation and Language · Computer Science 2021-06-02 Yongfeng Huang, Yujun Chen, Yulun Du, Zhilin Yang

Computational Rationalization: The Inverse Equilibrium Problem

Modeling the purposeful behavior of imperfect agents from a small number of observations is a challenging task. When restricted to the single-agent decision-theoretic setting, inverse optimal control techniques assume that observed behavior…

Computer Science and Game Theory · Computer Science 2015-03-19 Kevin Waugh, Brian D. Ziebart, J. Andrew Bagnell

Explaining Representation by Mutual Information

As interpretability gains attention in machine learning, there is a growing need for reliable models that fully explain representation content. We propose a mutual information (MI)-based method that decomposes neural network representations…

Machine Learning · Computer Science 2025-04-22 Lifeng Gu