Related papers: Efficient Task-Specific Data Valuation for Nearest…

200 papers

Data valuation is a growing research field that studies the influence of individual data points for machine learning (ML) models. Data Shapley, inspired by cooperative game theory and economics, is an effective method for data valuation.…

Machine Learning · Statistics · 2023-11-28 · Jiachen T. Wang, Ruoxi Jia

Data valuation has found various applications in machine learning, such as data filtering, efficient learning and incentives for data sharing. The most popular current approach to data valuation is the Shapley value. While popular for its…

Machine Learning · Computer Science · 2023-11-10 · Lauren Watson, Zeno Kujawa, Rayna Andreeva, Hao-Tsung Yang, Tariq Elahi, Rik Sarkar

This work aims to address an open problem in data valuation literature concerning the efficient computation of Data Shapley for weighted $K$ nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with…

Data Structures and Algorithms · Computer Science · 2024-01-23 · Jiachen T. Wang, Prateek Mittal, Ruoxi Jia

As data emerges as a vital driver of technological and economic advancements, a key challenge is accurately quantifying its value in algorithmic decision-making. The Shapley value, a well-established concept from cooperative game theory,…

Computer Science and Game Theory · Computer Science · 2025-11-20 · Xi Zheng, Xiangyu Chang, Ruoxi Jia, Yong Tan

"How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining…

Machine Learning · Computer Science · 2023-03-07 · Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gurel, Bo Li, Ce Zhang, Dawn Song, Costas Spanos

As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare and consumer markets, it has been…

Machine Learning · Statistics · 2019-06-11 · Amirata Ghorbani, James Zou

Data valuation, the task of quantifying the contribution of individual data points to model performance, has emerged as a fundamental challenge in machine learning. Game-theoretic approaches, such as the Banzhaf value, offer principled…

Machine Learning · Computer Science · 2026-05-21 · Guangyi Zhang, Lutz Oettershagen, Lixu Wang, Aristides Gionis

The Shapley value has been proposed as a solution to many applications in machine learning, including for equitable valuation of data. Shapley values are computationally expensive and involve the entire dataset. The query for a point's…

Machine Learning · Computer Science · 2022-06-02 · Lauren Watson, Rayna Andreeva, Hao-Tsung Yang, Rik Sarkar

The value and copyright of training data are crucial in the artificial intelligence industry. Service platforms should protect data providers' legitimate rights and fairly reward them for their contributions. Shapley value, a potent tool…

Machine Learning · Computer Science · 2025-11-21 · Haifeng Sun, Yu Xiong, Runze Wu, Xinyu Cai, Changjie Fan, Lan Zhang, Xiang-Yang Li

Shapley value is a classic notion from game theory, historically used to quantify the contributions of individuals within groups, and more recently applied to assign values to data points when training machine learning models. Despite its…

Machine Learning · Computer Science · 2020-02-28 · Amirata Ghorbani, Michael P. Kim, James Zou

The problem of explaining the behavior of deep neural networks has recently gained a lot of attention. While several attribution methods have been proposed, most come without strong theoretical foundations, which raises questions about…

Machine Learning · Computer Science · 2019-06-24 · Marco Ancona, Cengiz Öztireli, Markus Gross

Distributional data Shapley value (DShapley) has recently been proposed as a principled framework to quantify the contribution of individual datum in machine learning. DShapley develops the foundational game theory concept of Shapley values…

Machine Learning · Statistics · 2021-02-19 · Yongchan Kwon, Manuel A. Rivas, James Zou

Data valuation has become an increasingly significant discipline in data science due to the economic value of data. In the context of machine learning (ML), data valuation methods aim to equitably measure the contribution of each data point…

Machine Learning · Computer Science · 2023-06-13 · Xiang Li, Haocheng Xia, Jinfei Liu

The Shapley value provides a principled foundation for data valuation, but exact computation is #P-hard due to the exponential coalition space. Existing accelerations remain global and ignore a structural property of modern predictors: for…

Machine Learning · Computer Science · 2026-03-05 · Xuan Yang, Hsi-Wen Chen, Ming-Syan Chen, Jian Pei

The K-Nearest Neighbors (KNN) algorithm is widely used for classification and regression; however, it suffers from limitations, including the equal treatment of all samples. We propose Information-Modified KNN (IM-KNN), a novel approach…

Machine Learning · Computer Science · 2025-07-11 · Mohammad Ali Vahedifar, Azim Akhtarshenas, Mohammad Mohammadi Rafatpanah, Maryam Sabbaghian

Fair credit assignment is essential in various machine learning (ML) applications, and Shapley values have emerged as a valuable tool for this purpose. However, in critical ML applications such as data valuation and feature attribution, the…

Machine Learning · Computer Science · 2025-03-11 · Pranoy Panda, Siddharth Tandon, Vineeth N Balasubramanian

Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and…

Machine Learning · Computer Science · 2021-04-27 · Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song

Data valuation has garnered increasing attention in recent years, given the critical role of high-quality data in various applications. Among diverse data valuation approaches, Shapley value-based methods are predominant due to their strong…

Machine Learning · Computer Science · 2025-11-27 · Xiaoling Zhou, Ou Wu, Michael K. Ng, Hao Jiang

Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research. However, data valuation faces significant yet frequently overlooked…

Machine Learning · Computer Science · 2023-11-28 · Jiachen T. Wang, Yuqing Zhu, Yu-Xiang Wang, Ruoxi Jia, Prateek Mittal

We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a…

Artificial Intelligence · Computer Science · 2025-02-25 · Felipe Garrido-Lucero, Benjamin Heymann, Maxime Vono, Patrick Loiseau, Vianney Perchet