Related papers: Data Origin Inference in Machine Learning

Inferring Class Label Distribution of Training Data from Classifiers: An Accuracy-Augmented Meta-Classifier Attack

Property inference attacks against machine learning (ML) models aim to infer properties of the training data that are unrelated to the primary task of the model, and have so far been formulated as binary decision problems, i.e., whether or…

Machine Learning · Computer Science 2022-11-09 Raksha Ramakrishna, György Dán

REMIND: Input Loss Landscapes Reveal Residual Memorization in Post-Unlearning LLMs

Machine unlearning aims to remove the influence of specific training data from a model without requiring full retraining. This capability is crucial for ensuring privacy, safety, and regulatory compliance. Therefore, verifying whether a…

Computation and Language · Computer Science 2025-11-07 Liran Cohen, Yaniv Nemcovesky, Avi Mendelson

Dataset Inference: Ownership Resolution in Machine Learning

With increasingly more data and computation involved in their training, machine learning models constitute valuable intellectual property. This has spurred interest in model stealing, which is made more practical by advances in learning…

Machine Learning · Statistics 2021-04-23 Pratyush Maini, Mohammad Yaghini, Nicolas Papernot

Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning

While being deployed in many critical applications as core components, machine learning (ML) models are vulnerable to various security and privacy attacks. One major privacy attack in this domain is membership inference, where an adversary…

Cryptography and Security · Computer Science 2020-09-11 Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang

Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning

With the continued advancement and widespread adoption of machine learning (ML) models across various domains, ensuring user privacy and data security has become a paramount concern. In compliance with data privacy regulations, such as…

Machine Learning · Computer Science 2024-07-09 Nexhi Sula, Abhinav Kumar, Jie Hou, Han Wang, Reza Tourani

Model Provenance via Model DNA

Understanding the life cycle of the machine learning (ML) model is an intriguing area of research (e.g., understanding where the model comes from, how it is trained, and how it is used). This paper focuses on a novel problem within this…

Machine Learning · Computer Science 2024-07-19 Xin Mu, Yu Wang, Yehong Zhang, Jiaqi Zhang, Hui Wang, Yang Xiang, Yue Yu

MAIN: Multihead-Attention Imputation Networks

The problem of missing data, usually absent in curated and competition-standard datasets, is an unfortunate reality for most machine learning models used in industry applications. Recent work has focused on understanding the nature and the…

Machine Learning · Computer Science 2022-01-25 Spyridon Mouselinos, Kyriakos Polymenakos, Antonis Nikitakis, Konstantinos Kyriakopoulos

Membership Inference via Backdooring

Recently issued data privacy regulations like GDPR (General Data Protection Regulation) grant individuals the right to be forgotten. In the context of machine learning, this requires a model to forget about a training data sample if…

Cryptography and Security · Computer Science 2022-06-13 Hongsheng Hu, Zoran Salcic, Gillian Dobbie, Jinjun Chen, Lichao Sun, Xuyun Zhang

When Machine Unlearning Jeopardizes Privacy

The right to be forgotten states that a data owner has the right to erase their data from an entity storing it. In the context of machine learning (ML), the right to be forgotten requires an ML model owner to remove the data owner's data…

Cryptography and Security · Computer Science 2021-09-15 Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang

Machine Unlearning for Causal Inference

Machine learning models play a vital role in making predictions and deriving insights from data and are being increasingly used for causal inference. To preserve user privacy, it is important to enable the model to forget some of its…

Machine Learning · Computer Science 2023-08-29 Vikas Ramachandra, Mohit Sethi

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

Machine learning models built on datasets containing discriminative instances attributed to various underlying factors result in biased and unfair outcomes. It's a well founded and intuitive fact that existing bias mitigation strategies…

Machine Learning · Computer Science 2022-10-25 Bhushan Chaudhari, Akash Agarwal, Tanmoy Bhowmik

ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models

Machine learning (ML) has become a core component of many real-world applications and training data is a key factor that drives current progress. This huge success has led Internet companies to deploy machine learning as a service (MLaaS).…

Cryptography and Security · Computer Science 2018-12-18 Ahmed Salem, Yang Zhang, Mathias Humbert, Pascal Berrang, Mario Fritz, Michael Backes

Data Heterogeneity Modeling for Trustworthy Machine Learning

Data heterogeneity plays a pivotal role in determining the performance of machine learning (ML) systems. Traditional algorithms, which are typically designed to optimize average performance, often overlook the intrinsic diversity within…

Machine Learning · Computer Science 2025-06-03 Jiashuo Liu, Peng Cui

ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

Inference attacks against Machine Learning (ML) models allow adversaries to learn sensitive information about training data, model parameters, etc. While researchers have studied, in depth, several kinds of attacks, they have done so in…

Cryptography and Security · Computer Science 2021-10-07 Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang

Data Noising as Smoothing in Neural Network Language Models

Data noising is an effective technique for regularizing neural network models. While noising is widely adopted in application domains such as vision and speech, commonly used noising primitives have not been developed for discrete…

Machine Learning · Computer Science 2017-03-09 Ziang Xie, Sida I. Wang, Jiwei Li, Daniel Lévy, Aiming Nie, Dan Jurafsky, Andrew Y. Ng

Blackbox Dataset Inference for LLM

Today, the training of large language models (LLMs) can involve personally identifiable information and copyrighted material, incurring dataset misuse. To mitigate the problem of dataset misuse, this paper explores dataset…

Cryptography and Security · Computer Science 2025-12-09 Ruikai Zhou, Kang Yang, Xun Chen, Wendy Hui Wang, Guanhong Tao, Jun Xu

Robust Machine Learning by Transforming and Augmenting Imperfect Training Data

Machine Learning (ML) is an expressive framework for turning data into computer programs. Across many problem domains -- both in industry and policy settings -- the types of computer programs needed for accurate prediction or optimal…

Machine Learning · Computer Science 2023-12-21 Elliot Creager

Incorporating Experts' Judgment into Machine Learning Models

Machine learning (ML) models have been quite successful in predicting outcomes in many applications. However, in some cases, domain experts might have a judgment about the expected outcome that might conflict with the prediction of ML…

Machine Learning · Computer Science 2023-05-02 Hogun Park, Aly Megahed, Peifeng Yin, Yuya Ong, Pravar Mahajan, Pei Guo

Injective Domain Knowledge in Neural Networks for Transprecision Computing

Machine Learning (ML) models are very effective in many learning tasks, due to the capability to extract meaningful information from large data sets. Nevertheless, there are learning problems that cannot be easily solved relying on pure…

Machine Learning · Computer Science 2021-01-29 Andrea Borghesi, Federico Baldo, Michele Lombardi, Michela Milano

Attribute-to-Delete: Machine Unlearning via Datamodel Matching

Machine unlearning -- efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model -- has recently attracted significant research interest. Despite this interest, however, recent work…

Machine Learning · Computer Science 2024-11-13 Kristian Georgiev, Roy Rinberg, Sung Min Park, Shivam Garg, Andrew Ilyas, Aleksander Madry, Seth Neel