Related papers: Self-Explaining Structures Improve NLP Models

SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers

We introduce SelfExplain, a novel self-explaining model that explains a text classifier's predictions using phrase-based concepts. SelfExplain augments existing neural classifiers by adding (1) a globally interpretable layer that identifies…

Computation and Language · Computer Science 2021-09-09 Dheeraj Rajagopal, Vidhisha Balachandran, Eduard Hovy, Yulia Tsvetkov

Not All Features Are Equal: Feature Leveling Deep Neural Networks for Better Interpretation

Self-explaining models are models that reveal decision making parameters in an interpretable manner so that the model reasoning process can be directly understood by human beings. General Linear Models (GLMs) are self-explaining because the…

Machine Learning · Computer Science 2019-05-31 Yingjing Lu, Runde Yang

Towards Modeling Uncertainties of Self-explaining Neural Networks via Conformal Prediction

Despite the recent progress in deep neural networks (DNNs), it remains challenging to explain the predictions made by DNNs. Existing explanation methods for DNNs mainly focus on post-hoc explanations where another explanatory model is…

Machine Learning · Computer Science 2024-01-04 Wei Qian, Chenxu Zhao, Yangyi Li, Fenglong Ma, Chao Zhang, Mengdi Huai

Self-explaining Neural Network with Concept-based Explanations for ICU Mortality Prediction

Complex deep learning models show high performance in various clinical prediction tasks, but their inherent complexity makes it more challenging to explain model predictions for clinicians and healthcare providers. Existing research on…

Machine Learning · Computer Science 2026-02-06 Sayantan Kumar, Sean C. Yu, Thomas Kannampallil, Zachary Abrams, Andrew Michelson, Philip R. O. Payne

Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models

Current Natural Language Inference (NLI) models achieve impressive results, sometimes outperforming humans when evaluating on in-distribution test sets. However, as these models are known to learn from annotation artefacts and dataset…

Computation and Language · Computer Science 2022-10-24 Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Marek Rei

Improving Network Interpretability via Explanation Consistency Evaluation

While deep neural networks have achieved remarkable performance, they tend to lack transparency in prediction. The pursuit of greater interpretability in neural networks often results in a degradation of their original performance. Some…

Computer Vision and Pattern Recognition · Computer Science 2024-08-09 Hefeng Wu, Hao Jiang, Keze Wang, Ziyi Tang, Xianghuan He, Liang Lin

Explaining, Evaluating and Enhancing Neural Networks' Learned Representations

Most efforts in interpretability in deep learning have focused on (1) extracting explanations of a specific downstream task in relation to the input features and (2) imposing constraints on the model, often at the expense of predictive…

Machine Learning · Computer Science 2022-02-22 Marco Bertolini, Djork-Arné Clevert, Floriane Montanari

Towards Robust Interpretability with Self-Explaining Neural Networks

Most recent work on interpretability of complex machine learning models has focused on estimating a posteriori explanations for previously trained models around specific predictions. Self-explaining models where…

Machine Learning · Computer Science 2018-12-05 David Alvarez-Melis, Tommi S. Jaakkola

Explaining Language Models' Predictions with High-Impact Concepts

The emergence of large-scale pretrained language models has posed unprecedented challenges in deriving explanations of why the model has made some predictions. Stemming from the compositional nature of languages, spurious correlations have…

Computation and Language · Computer Science 2023-05-04 Ruochen Zhao, Shafiq Joty, Yongjie Wang, Tan Wang

Learning from Explanations with Neural Execution Tree

While deep neural networks have achieved impressive performance on a range of NLP tasks, these data-hungry models heavily rely on labeled data, which restricts their applications in scenarios where data annotation is expensive. Natural…

Computation and Language · Computer Science 2020-02-17 Ziqi Wang, Yujia Qin, Wenxuan Zhou, Jun Yan, Qinyuan Ye, Leonardo Neves, Zhiyuan Liu, Xiang Ren

Less is More: A Lightweight and Robust Neural Architecture for Discourse Parsing

Complex feature extractors are widely employed for building text representations. However, these complex feature extractors make NLP systems prone to overfitting, especially when the downstream training datasets are relatively small,…

Computation and Language · Computer Science 2023-09-11 Ming Li, Ruihong Huang

Interpreting Deep Learning Models in Natural Language Processing: A Review

Neural network models have achieved state-of-the-art performances in a wide range of natural language processing (NLP) tasks. However, a long-standing criticism against neural network models is the lack of interpretability, which not only…

Computation and Language · Computer Science 2021-10-26 Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard Hovy, Jiwei Li

Towards Explainable NLP: A Generative Explanation Framework for Text Classification

Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning…

Computation and Language · Computer Science 2019-06-12 Hui Liu, Qingyu Yin, William Yang Wang

Learning Robust and Lightweight Model through Separable Structured Transformations

With the proliferation of mobile devices and the Internet of Things, deep learning models are increasingly deployed on devices with limited computing resources and memory, and are exposed to the threat of adversarial noise. Learning deep…

Computer Vision and Pattern Recognition · Computer Science 2021-12-30 Xian Wei, Yanhui Huang, Yangyu Xu, Mingsong Chen, Hai Lan, Yuanxiang Li, Zhongfeng Wang, Xuan Tang

A Framework to Learn with Interpretation

To tackle interpretability in deep learning, we present a novel framework to jointly learn a predictive model and its associated interpretation model. The interpreter provides both local and global interpretability about the predictive…

Machine Learning · Computer Science 2022-02-24 Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Enhancing Interpretability for Vision Models via Shapley Value Optimization

Deep neural networks have demonstrated remarkable performance across various domains, yet their decision-making processes remain opaque. Although many explanation methods are dedicated to bringing the obscurity of DNNs to light, they…

Computer Vision and Pattern Recognition · Computer Science 2025-12-17 Kanglong Fan, Yunqiao Yang, Chen Ma

Sentence Ordering and Coherence Modeling using Recurrent Neural Networks

Modeling the structure of coherent texts is a key NLP problem. The task of coherently organizing a given set of sentences has been commonly used to build and evaluate models that understand such structure. We propose an end-to-end…

Computation and Language · Computer Science 2017-12-25 Lajanugen Logeswaran, Honglak Lee, Dragomir Radev

Model Explainability in Deep Learning Based Natural Language Processing

Machine learning (ML) model explainability has received growing attention, especially in the area related to model risk and regulations. In this paper, we reviewed and compared some popular ML model explainability methodologies, especially…

Artificial Intelligence · Computer Science 2021-06-15 Shafie Gholizadeh, Nengfeng Zhou

Do We Really Need GNNs with Explicit Structural Modeling? MLPs Suffice for Language Model Representations

Explicit structural information has been proven to be encoded by Graph Neural Networks (GNNs), serving as auxiliary knowledge to enhance model capabilities and improve performance in downstream NLP tasks. However, recent studies indicate…

Computation and Language · Computer Science 2025-06-30 Li Zhou, Hao Jiang, Junjie Li, Zefeng Zhao, Feng Jiang, Wenyu Chen, Haizhou Li

Towards a Framework for Evaluating Explanations in Automated Fact Verification

As deep neural models in NLP become more complex, and as a consequence opaque, the necessity to interpret them becomes greater. A burgeoning interest has emerged in rationalizing explanations to provide short and coherent justifications for…

Computation and Language · Computer Science 2024-05-21 Neema Kotonya, Francesca Toni