Related papers: Accelerated Shapley Value Approximation for Data E…

200 papers

The value and copyright of training data are crucial in the artificial intelligence industry. Service platforms should protect data providers' legitimate rights and fairly reward them for their contributions. Shapley value, a potent tool…

Machine Learning · Computer Science 2025-11-21 Haifeng Sun, Yu Xiong, Runze Wu, Xinyu Cai, Changjie Fan, Lan Zhang, Xiang-Yang Li

As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare and consumer markets, it has been…

Machine Learning · Statistics 2019-06-11 Amirata Ghorbani, James Zou

We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a…

Artificial Intelligence · Computer Science 2025-02-25 Felipe Garrido-Lucero, Benjamin Heymann, Maxime Vono, Patrick Loiseau, Vianney Perchet

"How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining…

Machine Learning · Computer Science 2023-03-07 Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gurel, Bo Li, Ce Zhang, Dawn Song, Costas Spanos

Data Shapley has recently been proposed as a principled framework to quantify the contribution of individual datum in machine learning. It can effectively identify helpful or harmful data points for a learning algorithm. In this paper, we…

Machine Learning · Computer Science 2022-01-20 Yongchan Kwon, James Zou

Data is a critical asset for training large language models (LLMs), alongside compute resources and skilled workers. While some training data is publicly available, substantial investment is required to generate proprietary datasets, such…

Machine Learning · Computer Science 2026-01-27 Mélissa Tamine, Otmane Sakhi, Benjamin Heymann

The Shapley value has been proposed as a solution to many applications in machine learning, including for equitable valuation of data. Shapley values are computationally expensive and involve the entire dataset. The query for a point's…

Machine Learning · Computer Science 2022-06-02 Lauren Watson, Rayna Andreeva, Hao-Tsung Yang, Rik Sarkar

Data valuation has garnered increasing attention in recent years, given the critical role of high-quality data in various applications. Among diverse data valuation approaches, Shapley value-based methods are predominant due to their strong…

Machine Learning · Computer Science 2025-11-27 Xiaoling Zhou, Ou Wu, Michael K. Ng, Hao Jiang

Data valuation has become an increasingly significant discipline in data science due to the economic value of data. In the context of machine learning (ML), data valuation methods aim to equitably measure the contribution of each data point…

Machine Learning · Computer Science 2023-06-13 Xiang Li, Haocheng Xia, Jinfei Liu

As data emerges as a vital driver of technological and economic advancements, a key challenge is accurately quantifying its value in algorithmic decision-making. The Shapley value, a well-established concept from cooperative game theory,…

Computer Science and Game Theory · Computer Science 2025-11-20 Xi Zheng, Xiangyu Chang, Ruoxi Jia, Yong Tan

Shapley value is a classic notion from game theory, historically used to quantify the contributions of individuals within groups, and more recently applied to assign values to data points when training machine learning models. Despite its…

Machine Learning · Computer Science 2020-02-28 Amirata Ghorbani, Michael P. Kim, James Zou

Distributional data Shapley value (DShapley) has recently been proposed as a principled framework to quantify the contribution of individual datum in machine learning. DShapley develops the foundational game theory concept of Shapley values…

Machine Learning · Statistics 2021-02-19 Yongchan Kwon, Manuel A. Rivas, James Zou

Data valuation using Shapley value has emerged as a prevalent research domain in machine learning applications. However, it is a challenge to address the role of order in data cooperation as most research lacks such discussion. To tackle…

Machine Learning · Computer Science 2023-05-04 Jie Liu, Peizheng Wang, Chao Wu

The proliferation of large models has intensified the need for efficient data valuation methods to quantify the contribution of individual data providers. Traditional approaches, such as game-theory-based Shapley value and…

Artificial Intelligence · Computer Science 2025-09-24 Le Ma, Shirao Yang, Zihao Wang, Yinggui Wang, Lei Wang, Tao Wei, Kejun Zhang

Fair credit assignment is essential in various machine learning (ML) applications, and Shapley values have emerged as a valuable tool for this purpose. However, in critical ML applications such as data valuation and feature attribution, the…

Machine Learning · Computer Science 2025-03-11 Pranoy Panda, Siddharth Tandon, Vineeth N Balasubramanian

Data valuation, or the valuation of individual datum contributions, has seen growing interest in machine learning due to its demonstrable efficacy for tasks such as noisy label detection. In particular, due to the desirable axiomatic…

Machine Learning · Computer Science 2022-11-15 Stephanie Schoch, Haifeng Xu, Yangfeng Ji

Measuring the value of individual samples is critical for many data-driven tasks, e.g., the training of a deep learning model. Recent literature witnesses the substantial efforts in developing data valuation methods. The primary data…

Machine Learning · Computer Science 2024-06-06 Ou Wu, Weiyao Zhu, Mengyang Li

Although Shapley values have been shown to be highly effective for identifying harmful training instances, dataset size and model complexity constraints limit the ability to apply Shapley-based data valuation to fine-tuning large…

Computation and Language · Computer Science 2023-06-21 Stephanie Schoch, Ritwick Mishra, Yangfeng Ji

Data Shapley provides a principled approach to data valuation and plays a crucial role in data-centric machine learning (ML) research. Data selection is considered a standard application of Data Shapley. However, its data selection…

Machine Learning · Computer Science 2024-05-08 Jiachen T. Wang, Tianji Yang, James Zou, Yongchan Kwon, Ruoxi Jia

Data valuation, especially quantifying data value in algorithmic prediction and decision-making, is a fundamental problem in data trading scenarios. The most widely used method is to define the data Shapley and approximate it by means of…

Machine Learning · Statistics 2023-05-23 Mengmeng Wu, Ruoxi Jia, Changle Lin, Wei Huang, Xiangyu Chang