Related papers: Enhancing Training Data Attribution with Represent…

Daunce: Data Attribution through Uncertainty Estimation

Training data attribution (TDA) methods aim to identify which training examples most influence a model's predictions on specific test data. By quantifying these influences, TDA supports critical applications such as data debugging,…

Machine Learning · Computer Science 2025-05-30 Xingyuan Pan, Chenlu Ye, Joseph Melkonian, Jiaqi W. Ma, Tong Zhang

Exploring Training Data Attribution under Limited Access Constraints

Training data attribution (TDA) plays a critical role in understanding the influence of individual training data points on model predictions. Gradient-based TDA methods, popularized by influence functions for their superior…

Machine Learning · Computer Science 2025-09-17 Shiyuan Zhang, Junwei Deng, Juhan Bae, Jiaqi Ma

Training Data Attribution via Approximate Unrolled Differentiation

Many training data attribution (TDA) methods aim to estimate how a model's behavior would change if one or more data points were removed from the training set. Methods based on implicit differentiation, such as influence functions, can be…

Machine Learning · Computer Science 2024-05-22 Juhan Bae, Wu Lin, Jonathan Lorraine, Roger Grosse

Concept Influence: Leveraging Interpretability to Improve Performance and Efficiency in Training Data Attribution

As large language models are increasingly trained and fine-tuned, practitioners need methods to identify which training data drive specific behaviors, particularly unintended ones. Training Data Attribution (TDA) methods address this by…

Artificial Intelligence · Computer Science 2026-02-17 Matthew Kowal, Goncalo Paulo, Louis Jaburi, Tom Tseng, Lev E McKinney, Stefan Heimersheim, Aaron David Tucker, Adam Gleave, Kellin Pelrine

Efficient Ensembles Improve Training Data Attribution

Training data attribution (TDA) methods aim to quantify the influence of individual training data points on the model predictions, with broad applications in data-centric AI, such as mislabel detection, data selection, and copyright…

Machine Learning · Computer Science 2024-05-28 Junwei Deng, Ting-Wei Li, Shichang Zhang, Jiaqi Ma

LoRIF: Low-Rank Influence Functions for Scalable Training Data Attribution

Training data attribution (TDA) identifies which training examples most influenced a model's prediction. Influence function methods are a theoretically grounded family of TDA methods and exploit gradients. To overcome the scalability…

Machine Learning · Computer Science 2026-05-15 Shuangqi Li, Hieu Le, Jingyi Xu, Mathieu Salzmann

A Bayesian Information-Theoretic Approach to Data Attribution

Training Data Attribution (TDA) seeks to trace model predictions back to influential training examples, enhancing interpretability and safety. We formulate TDA as a Bayesian information-theoretic problem: subsets are scored by the…

Machine Learning · Computer Science 2026-04-10 Dharmesh Tailor, Nicolò Felicioni, Kamil Ciosek

A Bayesian Approach To Analysing Training Data Attribution In Deep Learning

Training data attribution (TDA) techniques find influential training data for the model's prediction on the test data of interest. They approximate the impact of down- or up-weighting a particular training sample. While conceptually useful,…

Machine Learning · Computer Science 2023-11-01 Elisa Nguyen, Minjoon Seo, Seong Joon Oh

Nonparametric Data Attribution for Diffusion Models

Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. Existing methods for diffusion models typically require access to model gradients or retraining, limiting their…

Machine Learning · Computer Science 2025-10-17 Yutian Zhao, Chao Du, Xiaosen Zheng, Tianyu Pang, Min Lin

Sparse, Efficient and Explainable Data Attribution with DualXDA

Data Attribution (DA) is an emerging approach in the field of eXplainable Artificial Intelligence (XAI), aiming to identify influential training datapoints which determine model outputs. It seeks to provide transparency about the model and…

Machine Learning · Computer Science 2025-12-22 Galip Ümit Yolcu, Moritz Weckbecker, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

Better Training Data Attribution via Better Inverse Hessian-Vector Products

Training data attribution (TDA) provides insights into which training data is responsible for a learned model behavior. Gradient-based TDA methods such as influence functions and unrolled differentiation both involve a computation that…

Machine Learning · Computer Science 2025-07-22 Andrew Wang, Elisa Nguyen, Runshi Yang, Juhan Bae, Sheila A. McIlraith, Roger Grosse

Scalable Influence and Fact Tracing for Large Language Model Pretraining

Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples, and the application of these methods to large language model (LLM) outputs could significantly advance model transparency and data…

Computation and Language · Computer Science 2024-12-24 Tyler A. Chang, Dheeraj Rajagopal, Tolga Bolukbasi, Lucas Dixon, Ian Tenney

Generalized Group Data Attribution

Data Attribution (DA) methods quantify the influence of individual training data points on model outputs and have broad applications such as explainability, data selection, and noisy label identification. However, existing DA methods are…

Machine Learning · Computer Science 2024-10-22 Dan Ley, Suraj Srinivas, Shichang Zhang, Gili Rusak, Himabindu Lakkaraju

Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration

The black-box nature of large language models (LLMs) poses challenges in interpreting results, impacting issues such as data intellectual property protection and hallucination tracing. Training data attribution (TDA) methods are considered…

Computation and Language · Computer Science 2024-11-20 Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng

Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs

Training data attribution (TDA) methods offer to trace a model's prediction on any given example back to specific influential training examples. Existing approaches do so by assigning a scalar influence score to each training example, under…

Machine Learning · Computer Science 2023-03-15 Kelvin Guu, Albert Webson, Ellie Pavlick, Lucas Dixon, Ian Tenney, Tolga Bolukbasi

Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods

Training data attribution (TDA) is concerned with understanding model behavior in terms of the training data. This paper draws attention to the common setting where one has access only to the final trained model, and not the training…

Machine Learning · Computer Science 2025-11-25 Dennis Wei, Inkit Padhi, Soumya Ghosh, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Maria Chang

Large-Scale Training Data Attribution for Music Generative Models via Unlearning

This paper explores the use of unlearning methods for training data attribution (TDA) in music generative models trained on large-scale datasets. TDA aims to identify which specific training data points contributed the most to the…

Sound · Computer Science 2025-10-08 Woosung Choi, Junghyun Koo, Kin Wai Cheuk, Joan Serrà, Marco A. Martínez-Ramírez, Yukara Ikemiya, Naoki Murata, Yuhta Takida, Wei-Hsiang Liao, Yuki Mitsufuji

Scalable Data Attribution via Forward-Only Test-Time Inference

Data attribution seeks to trace model behavior back to the training examples that shaped it, enabling debugging, auditing, and data valuation at scale. Classical influence-function methods offer a principled foundation but remain…

Machine Learning · Computer Science 2025-11-26 Sibo Ma, Julian Nyarko

TabRep: Training Tabular Diffusion Models with a Simple and Effective Continuous Representation

Diffusion models have been the predominant generative model for tabular data generation. However, they face the conundrum of modeling under a separate versus a unified data representation. The former encounters the challenge of jointly…

Machine Learning · Computer Science 2025-12-23 Jacob Si, Zijing Ou, Mike Qu, Zhengrui Xiang, Yingzhen Li

Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond

In recent years, training data attribution (TDA) methods have emerged as a promising direction for the interpretability of neural networks. While research around TDA is thriving, limited effort has been dedicated to the evaluation of…

Machine Learning · Computer Science 2024-10-11 Dilyara Bareeva, Galip Ümit Yolcu, Anna Hedström, Niklas Schmolenski, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin