Related papers: Explaining Classes through Word Attribution

Looking Deeper into Deep Learning Model: Attribution-based Explanations of TextCNN

Layer-wise Relevance Propagation (LRP) and saliency maps have been recently used to explain the predictions of Deep Learning models, specifically in the domain of text classification. Given different attribution-based explanations to…

Information Retrieval · Computer Science · 2018-12-04 · Wenting Xiong, Iftitahu Ni'mah, Juan M. G. Huesca, Werner van Ipenburg, Jan Veldsink, Mykola Pechenizkiy

Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training

Feature attribution methods highlight the important input tokens as explanations to model predictions, which have been widely applied to deep neural networks towards trustworthy AI. However, recent works show that explanations provided by…

Computation and Language · Computer Science · 2024-01-01 · Dongfang Li, Baotian Hu, Qingcai Chen, Shan He

"What is Relevant in a Text Document?": An Interpretable Machine Learning Approach

Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, allowing to…

Computation and Language · Computer Science · 2017-11-01 · Leila Arras, Franziska Horn, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

Neural Text Classification by Jointly Learning to Cluster and Align

Distributional text clustering delivers semantically informative representations and captures the relevance between each word and semantic clustering centroids. We extend the neural text clustering approach to text classification tasks by…

Computation and Language · Computer Science · 2020-11-25 · Yekun Chai, Haidong Zhang, Shuo Jin

An Additive Instance-Wise Approach to Multi-class Model Interpretation

Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system. A large number of interpreting methods focus on identifying explanatory input features, which generally fall into two main…

Machine Learning · Computer Science · 2023-06-02 · Vy Vo, Van Nguyen, Trung Le, Quan Hung Tran, Gholamreza Haffari, Seyit Camtepe, Dinh Phung

Selective Explanations

Feature attribution methods explain black-box machine learning (ML) models by assigning importance scores to input features. These methods can be computationally expensive for large ML models. To address this challenge, there has been…

Computers and Society · Computer Science · 2024-05-31 · Lucas Monteiro Paes, Dennis Wei, Flavio P. Calmon

Class Vectors: Embedding representation of Document Classes

Distributed representations of words and paragraphs as semantic embeddings in high dimensional data are used across a number of Natural Language Understanding tasks such as retrieval, translation, and classification. In this work, we…

Computation and Language · Computer Science · 2015-08-04 · Devendra Singh Sachan, Shailesh Kumar

On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box

Attribution methods shed light on the explainability of data-driven approaches such as deep learning models by uncovering the most influential features in a to-be-explained decision. While determining feature attributions via gradients…

Machine Learning · Computer Science · 2024-05-15 · Yi Cai, Gerhard Wunder

Incorporating Priors with Feature Attribution on Text Classification

Feature attribution methods, proposed recently, help users interpret the predictions of complex models. Our approach integrates feature attributions into the objective function to allow machine learning practitioners to incorporate priors…

Computation and Language · Computer Science · 2019-06-21 · Frederick Liu, Besim Avci

Understanding Text Classification Data and Models Using Aggregated Input Salience

Realizing when a model is right for a wrong reason is not trivial and requires a significant effort by model developers. In some cases an input salience method, which highlights the most important parts of the input, may reveal problematic…

Computation and Language · Computer Science · 2023-01-12 · Sebastian Ebert, Alice Shoshana Jakobovits, Katja Filippova

Explaining word embeddings with perfect fidelity: Case study in research impact prediction

The best-performing approaches for scholarly document quality prediction are based on embedding models. In addition to their performance when used in classifiers, embedding models can also provide predictions even for words that were not…

Computation and Language · Computer Science · 2025-08-29 · Lucie Dvorackova, Marcin P. Joachimiak, Michal Cerny, Adriana Kubecova, Vilem Sklenak, Tomas Kliegr

X-Class: Text Classification with Extremely Weak Supervision

In this paper, we explore text classification with extremely weak supervision, i.e., only relying on the surface text of class names. This is a more challenging setting than the seed-driven weak supervision, which allows a few seed words…

Computation and Language · Computer Science · 2022-02-09 · Zihan Wang, Dheeraj Mekala, Jingbo Shang

Generating visual explanations from deep networks using implicit neural representations

Explaining deep learning models in a way that humans can easily understand is essential for responsible artificial intelligence applications. Attribution methods constitute an important area of explainable deep learning. The attribution…

Computer Vision and Pattern Recognition · Computer Science · 2025-01-22 · Michal Byra, Henrik Skibbe

A Multiplicative Model for Learning Distributed Text-Based Attribute Representations

In this paper we propose a general framework for learning distributed representations of attributes: characteristics of text whose representations can be jointly learned with word embeddings. Attributes can correspond to document indicators…

Machine Learning · Computer Science · 2014-06-12 · Ryan Kiros, Richard S. Zemel, Ruslan Salakhutdinov

Classification via Incoherent Subspaces

This article presents a new classification framework that can extract individual features per class. The scheme is based on a model of incoherent subspaces, each one associated to one class, and a model on how the elements in a class are…

Computer Vision and Pattern Recognition · Computer Science · 2010-05-11 · Karin Schnass, Pierre Vandergheynst

Seeing in Words: Learning to Classify through Language Bottlenecks

Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks. In contrast, humans can explain their predictions using succinct and intuitive descriptions. To incorporate explainability…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-04 · Khalid Saifullah, Yuxin Wen, Jonas Geiping, Micah Goldblum, Tom Goldstein

Hybrid Attribution Priors for Explainable and Robust Model Training

Small language models (SLMs) are widely used in tasks that require low latency and lightweight deployment, particularly classification. As interpretability and robustness gain increasing importance, explanation-guided learning has emerged…

Machine Learning · Computer Science · 2025-12-18 · Zhuoran Zhang, Feng Zhang, Shangyuan Li, Yang Shi, Yuanxing Zhang, Wei Chen, Tengjiao Wang, Kam-Fai Wong

TF-CR: Weighting Embeddings for Text Classification

Text classification, as the task consisting in assigning categories to textual instances, is a very common task in information science. Methods learning distributed representations of words, such as word embeddings, have become popular in…

Computation and Language · Computer Science · 2020-12-15 · Arkaitz Zubiaga

Classifying text using machine learning models and determining conversation drift

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science · 2022-11-16 · Chaitanya Chadha, Vandit Gupta, Deepak Gupta, Ashish Khanna

Automatic Discovery, Association Estimation and Learning of Semantic Attributes for a Thousand Categories

Attribute-based recognition models, due to their impressive performance and their ability to generalize well on novel categories, have been widely adopted for many computer vision applications. However, usually both the attribute vocabulary…

Computer Vision and Pattern Recognition · Computer Science · 2017-04-13 · Ziad Al-Halah, Rainer Stiefelhagen