Related papers: Compact Example-Based Explanations for Language Mo…

RelatIF: Identifying Explanatory Training Examples via Relative Influence

In this work, we focus on the use of influence functions to identify relevant training examples that one might hope "explain" the predictions of a machine learning model. One shortcoming of influence functions is that the training examples…

Machine Learning · Computer Science 2020-03-27 Elnaz Barshan, Marc-Etienne Brunet, Gintare Karolina Dziugaite

Evaluating Explanations: How much do explanations from the teacher aid students?

While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of…

Computation and Language · Computer Science 2021-12-20 Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen

Training Data Influence Analysis and Estimation: A Survey

Good models require good training data. For overparameterized deep models, the causal relationship between training data and model predictions is increasingly opaque and poorly understood. Influence analysis partially demystifies training's…

Machine Learning · Computer Science 2024-04-02 Zayd Hammoudeh, Daniel Lowd

On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation

In the recent advances of natural language processing, the scale of the state-of-the-art models and datasets is usually extensive, which challenges the application of sample-based explanation methods in many aspects, such as explanation…

Computation and Language · Computer Science 2021-06-10 Wei Zhang, Ziming Huang, Yada Zhu, Guangnan Ye, Xiaodong Cui, Fan Zhang

Rethinking Explanation Evaluation under the Retraining Scheme

Feature attribution has gained prominence as a tool for explaining model decisions, yet evaluating explanation quality remains challenging due to the absence of ground-truth explanations. To circumvent this, explanation-guided input…

Machine Learning · Computer Science 2025-11-12 Yi Cai, Thibaud Ardoin, Mayank Gulati, Gerhard Wunder

Influence-driven Curriculum Learning for Pre-training on Limited Data

Curriculum learning, a training technique where data is presented to the model in order of example difficulty (e.g., from simpler to more complex documents), has shown limited success for pre-training language models. In this work, we…

Computation and Language · Computer Science 2025-09-29 Loris Schoenegger, Lukas Thoma, Terra Blevins, Benjamin Roth

Towards Understanding the Influence of Training Samples on Explanations

Explainable AI (XAI) is widely used to analyze AI systems' decision-making, such as providing counterfactual explanations for recourse. When unexpected explanations occur, users may want to understand the training data properties shaping…

Machine Learning · Computer Science 2025-03-26 André Artelt, Barbara Hammer

Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings

Attribution scores indicate the importance of different input parts and can, thus, explain model behaviour. Currently, prompt-based models are gaining popularity, i.a., due to their easier adaptability in low-resource settings. However, the…

Computation and Language · Computer Science 2024-03-11 Wei Zhou, Heike Adel, Hendrik Schuff, Ngoc Thang Vu

Extractive Explanations for Interpretable Text Ranking

Neural document ranking models perform impressively well due to superior language understanding gained from pre-training tasks. However, due to their complexity and large number of parameters, these (typically transformer-based) models are…

Information Retrieval · Computer Science 2022-12-02 Jurek Leonhardt, Koustav Rudra, Avishek Anand

Learning from Sufficient Rationales: Analysing the Relationship Between Explanation Faithfulness and Token-level Regularisation Strategies

Human explanations of natural language, rationales, form a tool to assess whether models learn a label for the right reasons or rely on dataset-specific shortcuts. Sufficiency is a common metric for estimating the informativeness of…

Computation and Language · Computer Science 2025-11-21 Jonathan Kamp, Lisa Beinborn, Antske Fokkens

How to Achieve Higher Accuracy with Less Training Points?

In the era of large-scale model training, the extensive use of available datasets has resulted in significant computational inefficiencies. To tackle this issue, we explore methods for identifying informative subsets of training data that…

Machine Learning · Computer Science 2025-04-21 Jinghan Yang, Anupam Pani, Yunchao Zhang

Feature Importance Depends on Properties of the Data: Towards Choosing the Correct Explanations for Your Data and Decision Trees based Models

In order to ensure the reliability of the explanations of machine learning models, it is crucial to establish their advantages and limits and in which case each of these methods outperform. However, the current understanding of when and how…

Machine Learning · Computer Science 2025-02-12 Célia Wafa Ayad, Thomas Bonnier, Benjamin Bosch, Sonali Parbhoo, Jesse Read

Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions

The increasing complexity of foundational models underscores the necessity for explainability, particularly for fine-tuning, the most widely used training method for adapting models to downstream tasks. Instance attribution, one type of…

Machine Learning · Computer Science 2024-06-10 Jingtan Wang, Xiaoqiang Lin, Rui Qiao, Chuan-Sheng Foo, Bryan Kian Hsiang Low

Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions

Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. But what exactly in the training data causes a model to make a certain prediction? We seek to answer this question by…

Computation and Language · Computer Science 2023-03-27 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, Yoav Goldberg

Model-specific Data Subsampling with Influence Functions

Model selection requires repeatedly evaluating models on a given dataset and measuring their relative performances. In modern applications of machine learning, the models being considered are increasingly more expensive to evaluate and the…

Machine Learning · Computer Science 2020-10-21 Anant Raj, Cameron Musco, Lester Mackey, Nicolo Fusi

Approximating Score-based Explanation Techniques Using Conformal Regression

Score-based explainable machine-learning techniques are often used to understand the logic behind black-box models. However, such explanation techniques are often computationally expensive, which limits their application in time-critical…

Machine Learning · Computer Science 2023-08-24 Amr Alkhatib, Henrik Boström, Sofiane Ennadir, Ulf Johansson

Concept Matching for Low-Resource Classification

We propose a model to tackle classification tasks in the presence of very little training data. To this aim, we approximate the notion of exact match with a theoretically sound mechanism that computes a probability of matching in the input…

Machine Learning · Computer Science 2020-06-02 Federico Errica, Ludovic Denoyer, Bora Edizel, Fabio Petroni, Vassilis Plachouras, Fabrizio Silvestri, Sebastian Riedel

Training Deep Models to be Explained with Fewer Examples

Although deep models achieve high predictive performance, it is difficult for humans to understand the predictions they made. Explainability is important for real-world applications to justify their reliability. Many example-based…

Machine Learning · Statistics 2021-12-08 Tomoharu Iwata, Yuya Yoshikawa

Can language models learn from explanations in context?

Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, explanations that connect examples to task principles can improve learning. We therefore investigate whether explanations of few-shot examples…

Computation and Language · Computer Science 2022-10-11 Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill

Investigating Training and Generalization in Faithful Self-Explanations of Large Language Models

Large language models have the potential to generate explanations for their own predictions in a variety of styles based on user instructions. Recent research has examined whether these self-explanations faithfully reflect the models'…

Computation and Language · Computer Science 2025-12-09 Tomoki Doi, Masaru Isonuma, Hitomi Yanaka