Related papers: Enhancing Training Data Attribution with Represent…

200 papers

Training data attribution (TDA) methods aim to identify which training examples influence a model's predictions on specific test data most. By quantifying these influences, TDA supports critical applications such as data debugging,…

Machine Learning · Computer Science 2025-05-30 Xingyuan Pan, Chenlu Ye, Joseph Melkonian, Jiaqi W. Ma, Tong Zhang

Training data attribution (TDA) plays a critical role in understanding the influence of individual training data points on model predictions. Gradient-based TDA methods, popularized by influence function for their superior…

Machine Learning · Computer Science 2025-09-17 Shiyuan Zhang, Junwei Deng, Juhan Bae, Jiaqi Ma

Many training data attribution (TDA) methods aim to estimate how a model's behavior would change if one or more data points were removed from the training set. Methods based on implicit differentiation, such as influence functions, can be…

Machine Learning · Computer Science 2024-05-22 Juhan Bae, Wu Lin, Jonathan Lorraine, Roger Grosse

As large language models are increasingly trained and fine-tuned, practitioners need methods to identify which training data drive specific behaviors, particularly unintended ones. Training Data Attribution (TDA) methods address this by…

Training data attribution (TDA) methods aim to quantify the influence of individual training data points on the model predictions, with broad applications in data-centric AI, such as mislabel detection, data selection, and copyright…

Machine Learning · Computer Science 2024-05-28 Junwei Deng, Ting-Wei Li, Shichang Zhang, Jiaqi Ma

Training data attribution (TDA) identifies which training examples most influenced a model's prediction. Influence function methods are a theoretically grounded family of TDA methods and exploit gradients. To overcome the scalability…

Machine Learning · Computer Science 2026-05-15 Shuangqi Li, Hieu Le, Jingyi Xu, Mathieu Salzmann

Training Data Attribution (TDA) seeks to trace model predictions back to influential training examples, enhancing interpretability and safety. We formulate TDA as a Bayesian information-theoretic problem: subsets are scored by the…

Machine Learning · Computer Science 2026-04-10 Dharmesh Tailor, Nicolò Felicioni, Kamil Ciosek

Training data attribution (TDA) techniques find influential training data for the model's prediction on the test data of interest. They approximate the impact of down- or up-weighting a particular training sample. While conceptually useful,…

Machine Learning · Computer Science 2023-11-01 Elisa Nguyen, Minjoon Seo, Seong Joon Oh

Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. Existing methods for diffusion models typically require access to model gradients or retraining, limiting their…

Machine Learning · Computer Science 2025-10-17 Yutian Zhao, Chao Du, Xiaosen Zheng, Tianyu Pang, Min Lin

Data Attribution (DA) is an emerging approach in the field of eXplainable Artificial Intelligence (XAI), aiming to identify influential training datapoints which determine model outputs. It seeks to provide transparency about the model and…

Machine Learning · Computer Science 2025-12-22 Galip Ümit Yolcu, Moritz Weckbecker, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

Training data attribution (TDA) provides insights into which training data is responsible for a learned model behavior. Gradient-based TDA methods such as influence functions and unrolled differentiation both involve a computation that…

Machine Learning · Computer Science 2025-07-22 Andrew Wang, Elisa Nguyen, Runshi Yang, Juhan Bae, Sheila A. McIlraith, Roger Grosse

Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples, and the application of these methods to large language model (LLM) outputs could significantly advance model transparency and data…

Computation and Language · Computer Science 2024-12-24 Tyler A. Chang, Dheeraj Rajagopal, Tolga Bolukbasi, Lucas Dixon, Ian Tenney

Data Attribution (DA) methods quantify the influence of individual training data points on model outputs and have broad applications such as explainability, data selection, and noisy label identification. However, existing DA methods are…

Machine Learning · Computer Science 2024-10-22 Dan Ley, Suraj Srinivas, Shichang Zhang, Gili Rusak, Himabindu Lakkaraju

The black-box nature of large language models (LLMs) poses challenges in interpreting results, impacting issues such as data intellectual property protection and hallucination tracing. Training data attribution (TDA) methods are considered…

Computation and Language · Computer Science 2024-11-20 Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng

Training data attribution (TDA) methods offer to trace a model's prediction on any given example back to specific influential training examples. Existing approaches do so by assigning a scalar influence score to each training example, under…

Machine Learning · Computer Science 2023-03-15 Kelvin Guu, Albert Webson, Ellie Pavlick, Lucas Dixon, Ian Tenney, Tolga Bolukbasi

Training data attribution (TDA) is concerned with understanding model behavior in terms of the training data. This paper draws attention to the common setting where one has access only to the final trained model, and not the training…

Machine Learning · Computer Science 2025-11-25 Dennis Wei, Inkit Padhi, Soumya Ghosh, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Maria Chang

This paper explores the use of unlearning methods for training data attribution (TDA) in music generative models trained on large-scale datasets. TDA aims to identify which specific training data points contributed the most to the…

Data attribution seeks to trace model behavior back to the training examples that shaped it, enabling debugging, auditing, and data valuation at scale. Classical influence-function methods offer a principled foundation but remain…

Machine Learning · Computer Science 2025-11-26 Sibo Ma, Julian Nyarko

Diffusion models have been the predominant generative model for tabular data generation. However, they face the conundrum of modeling under a separate versus a unified data representation. The former encounters the challenge of jointly…

Machine Learning · Computer Science 2025-12-23 Jacob Si, Zijing Ou, Mike Qu, Zhengrui Xiang, Yingzhen Li

In recent years, training data attribution (TDA) methods have emerged as a promising direction for the interpretability of neural networks. While research around TDA is thriving, limited effort has been dedicated to the evaluation of…
