
Related papers: DUPRE: Data Utility Prediction for Efficient Data …

200 papers

Data valuation has become an increasingly significant discipline in data science due to the economic value of data. In the context of machine learning (ML), data valuation methods aim to equitably measure the contribution of each data point…

Machine Learning · Computer Science 2023-06-13 Xiang Li, Haocheng Xia, Jinfei Liu

As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare and consumer markets, it has been…

Machine Learning · Statistics 2019-06-11 Amirata Ghorbani, James Zou

Data valuation has found various applications in machine learning, such as data filtering, efficient learning and incentives for data sharing. The most popular current approach to data valuation is the Shapley value. While popular for its…

Machine Learning · Computer Science 2023-11-10 Lauren Watson, Zeno Kujawa, Rayna Andreeva, Hao-Tsung Yang, Tariq Elahi, Rik Sarkar

Data Shapley provides a principled approach to data valuation and plays a crucial role in data-centric machine learning (ML) research. Data selection is considered a standard application of Data Shapley. However, its data selection…

Machine Learning · Computer Science 2024-05-08 Jiachen T. Wang, Tianji Yang, James Zou, Yongchan Kwon, Ruoxi Jia

Data is a critical asset for training large language models (LLMs), alongside compute resources and skilled workers. While some training data is publicly available, substantial investment is required to generate proprietary datasets, such…

Machine Learning · Computer Science 2026-01-27 Mélissa Tamine, Otmane Sakhi, Benjamin Heymann

We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a…

Artificial Intelligence · Computer Science 2025-02-25 Felipe Garrido-Lucero, Benjamin Heymann, Maxime Vono, Patrick Loiseau, Vianney Perchet

Data valuation is an essential task in a data marketplace. It aims at fairly compensating data owners for their contribution. There is increasing recognition in the machine learning community that the Shapley value -- a foundational…

Cryptography and Security · Computer Science 2023-02-20 Zhihua Tian, Jian Liu, Jingyu Li, Xinle Cao, Ruoxi Jia, Jun Kong, Mengdi Liu, Kui Ren

Distributional data Shapley value (DShapley) has recently been proposed as a principled framework to quantify the contribution of individual datum in machine learning. DShapley develops the foundational game theory concept of Shapley values…

Machine Learning · Statistics 2021-02-19 Yongchan Kwon, Manuel A. Rivas, James Zou

"How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining…

Machine Learning · Computer Science 2023-03-07 Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gurel, Bo Li, Ce Zhang, Dawn Song, Costas Spanos

Understanding the decision-making process of machine learning models is crucial for ensuring trustworthy machine learning. Data Shapley, a landmark study on data valuation, advances this understanding by assessing the contribution of each…

Computer Science and Game Theory · Computer Science 2025-01-23 Huaiguang Cai

As data emerges as a vital driver of technological and economic advancements, a key challenge is accurately quantifying its value in algorithmic decision-making. The Shapley value, a well-established concept from cooperative game theory,…

Computer Science and Game Theory · Computer Science 2025-11-20 Xi Zheng, Xiangyu Chang, Ruoxi Jia, Yong Tan

The proliferation of large models has intensified the need for efficient data valuation methods to quantify the contribution of individual data providers. Traditional approaches, such as game-theory-based Shapley value and…

Artificial Intelligence · Computer Science 2025-09-24 Le Ma, Shirao Yang, Zihao Wang, Yinggui Wang, Lei Wang, Tao Wei, Kejun Zhang

The Shapley value (SV) and Least core (LC) are classic methods in cooperative game theory for cost/profit sharing problems. Both methods have recently been proposed as a principled solution for data valuation tasks, i.e., quantifying the…

Machine Learning · Computer Science 2022-04-08 Tianhao Wang, Yu Yang, Ruoxi Jia

Developing modern machine learning (ML) applications is data-centric, of which one fundamental challenge is to understand the influence of data quality to ML training -- "Which training examples are 'guilty' in making the trained ML model…

Machine Learning · Computer Science 2022-04-28 Bojan Karlaš, David Dao, Matteo Interlandi, Bo Li, Sebastian Schelter, Wentao Wu, Ce Zhang

The value and copyright of training data are crucial in the artificial intelligence industry. Service platforms should protect data providers' legitimate rights and fairly reward them for their contributions. Shapley value, a potent tool…

Machine Learning · Computer Science 2025-11-21 Haifeng Sun, Yu Xiong, Runze Wu, Xinyu Cai, Changjie Fan, Lan Zhang, Xiang-Yang Li

Quantifying the value of data within a machine learning workflow can play a pivotal role in making more strategic decisions in machine learning initiatives. The existing Shapley value based frameworks for data valuation in machine learning…

Machine Learning · Computer Science 2024-07-10 Ayush K Tarun, Vikram S Chundawat, Murari Mandal, Hong Ming Tan, Bowei Chen, Mohan Kankanhalli

Shapley value is a classic notion from game theory, historically used to quantify the contributions of individuals within groups, and more recently applied to assign values to data points when training machine learning models. Despite its…

Machine Learning · Computer Science 2020-02-28 Amirata Ghorbani, Michael P. Kim, James Zou

Data selection has emerged as a crucial downstream application of data valuation. While existing data valuation methods have shown promise in selection tasks, the theoretical foundations and full potential of using data values for selection…

Artificial Intelligence · Computer Science 2025-02-10 Hongliang Chi, Qiong Wu, Zhengyi Zhou, Jonathan Light, Emily Dodwell, Yao Ma

Measuring the value of individual samples is critical for many data-driven tasks, e.g., the training of a deep learning model. Recent literature witnesses the substantial efforts in developing data valuation methods. The primary data…

Machine Learning · Computer Science 2024-06-06 Ou Wu, Weiyao Zhu, Mengyang Li

Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and…

Machine Learning · Computer Science 2021-04-27 Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song