Related papers: PUMA: margin-based data pruning

Accelerating Deep Learning with Dynamic Data Pruning

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…

Machine Learning · Computer Science 2021-11-25 Ravi S Raju, Kyle Daruwalla, Mikko Lipasti

Efficient Adversarial Training With Data Pruning

Neural networks are susceptible to adversarial examples, small input perturbations that cause models to fail. Adversarial training is one of the solutions that stops adversarial examples; models are exposed to attacks during training and…

Machine Learning · Computer Science 2022-07-05 Maximilian Kaufmann, Yiren Zhao, Ilia Shumailov, Robert Mullins, Nicolas Papernot

DRoP: Distributionally Robust Data Pruning

In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning. Data pruning offers a solution by removing redundant or uninformative samples from the…

Machine Learning · Computer Science 2025-02-11 Artem Vysogorets, Kartik Ahuja, Julia Kempe

PUMA: Performance Unchanged Model Augmentation for Training Data Removal

Preserving the performance of a trained model while removing unique characteristics of marked training data points is challenging. Recent research usually suggests retraining a model from scratch with remaining training data or refining the…

Machine Learning · Statistics 2022-03-03 Ga Wu, Masoud Hashemi, Christopher Srinivasa

Improved Methods for Model Pruning and Knowledge Distillation

Model pruning is a performance optimization technique for large language models like R1 or o3-mini. However, existing pruning methods often lead to significant performance degradation or require extensive retraining and fine-tuning. This…

Computation and Language · Computer Science 2025-05-21 Wei Jiang, Anying Fu, Youling Zhang

Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning

Modern deep models are trained on large real-world datasets, where data quality varies and redundancy is common. Data-centric approaches such as dataset pruning have shown promise in improving training efficiency and model performance.…

Machine Learning · Computer Science 2025-07-18 Suorong Yang, Peijia Li, Yujie Liu, Zhiming Xu, Peng Ye, Wanli Ouyang, Furao Shen, Dongzhan Zhou

Supervised Robustness-preserving Data-free Neural Network Pruning

When deploying pre-trained neural network models in real-world applications, model consumers often encounter resource-constraint platforms such as mobile and smart devices. They typically use the pruning technique to reduce the size and…

Machine Learning · Computer Science 2025-06-19 Mark Huasong Meng, Guangdong Bai, Sin Gee Teo, Jin Song Dong

Beta-Rank: A Robust Convolutional Filter Pruning Method For Imbalanced Medical Image Analysis

As deep neural networks include a high number of parameters and operations, it can be a challenge to implement these models on devices with limited computational resources. Despite the development of novel pruning methods toward…

Computer Vision and Pattern Recognition · Computer Science 2023-06-27 Morteza Homayounfar, Mohamad Koohi-Moghadam, Reza Rawassizadeh, Varut Vardhanabhuti

Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation

Their vulnerability to small, imperceptible attacks limits the adoption of deep learning models to real-world systems. Adversarial training has proven to be one of the most promising strategies against these attacks, at the expense of a…

Machine Learning · Computer Science 2024-07-12 Björn Nieth, Thomas Altstidl, Leo Schwinn, Björn Eskofier

HYDRA: Pruning Adversarially Robust Neural Networks

In safety-critical but computationally resource-constrained applications, deep learning faces two key challenges: lack of robustness against adversarial attacks and large neural network size (often millions of parameters). While the…

Computer Vision and Pattern Recognition · Computer Science 2020-11-11 Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

Increasing-Margin Adversarial (IMA) Training to Improve Adversarial Robustness of Neural Networks

Deep neural networks (DNNs) are vulnerable to adversarial noises. Adversarial training is a general and effective strategy to improve DNN robustness (i.e., accuracy on noisy data) against adversarial noises. However, DNN models trained by…

Computer Vision and Pattern Recognition · Computer Science 2023-02-13 Linhai Ma, Liang Liang

Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning

Massive data is often considered essential for deep learning applications, but it also incurs significant computational and infrastructural costs. Therefore, dataset pruning (DP) has emerged as an effective way to improve data efficiency by…

Machine Learning · Computer Science 2023-11-21 Yihua Zhang, Yimeng Zhang, Aochuan Chen, Jinghan Jia, Jiancheng Liu, Gaowen Liu, Mingyi Hong, Shiyu Chang, Sijia Liu

Towards Certified Robustness of Distance Metric Learning

Metric learning aims to learn a distance metric such that semantically similar instances are pulled together while dissimilar instances are pushed away. Many existing methods consider maximizing or at least constraining a distance margin in…

Machine Learning · Statistics 2022-08-17 Xiaochen Yang, Yiwen Guo, Mingzhi Dong, Jing-Hao Xue

Pruning-based Data Selection and Network Fusion for Efficient Deep Learning

Efficient data selection is essential for improving the training efficiency of deep neural networks and reducing the associated annotation costs. However, traditional methods tend to be computationally expensive, limiting their scalability…

Machine Learning · Computer Science 2025-01-03 Humaira Kousar, Hasnain Irshad Bhatti, Jaekyun Moon

LoRA Unlearns More and Retains More (Student Abstract)

Due to increasing privacy regulations and regulatory compliance, Machine Unlearning (MU) has become essential. The goal of unlearning is to remove information related to a specific class from a model. Traditional approaches achieve exact…

Machine Learning · Computer Science 2024-11-20 Atharv Mittal

Dataset Pruning: Reducing Training Data by Examining Generalization Influence

The great success of deep learning heavily relies on increasingly larger training data, which comes at a price of huge computational and infrastructural costs. This poses crucial questions that, do all training data contribute to model's…

Machine Learning · Computer Science 2023-02-28 Shuo Yang, Zeke Xie, Hanyu Peng, Min Xu, Mingming Sun, Ping Li

Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)

A central goal in deep learning is to learn compact representations of features at every layer of a neural network, which is useful for both unsupervised representation learning and structured network pruning. While there is a growing body…

Machine Learning · Computer Science 2021-10-05 Jie Bu, Arka Daw, M. Maruf, Anuj Karpatne

Not All Data Matters: An End-to-End Adaptive Dataset Pruning Framework for Enhancing Model Performance and Efficiency

While deep neural networks have demonstrated remarkable performance across various tasks, they typically require massive training data. Due to the presence of redundancies and biases in real-world datasets, not all data in the training…

Artificial Intelligence · Computer Science 2023-12-12 Suorong Yang, Hongchao Yang, Suhan Guo, Furao Shen, Jian Zhao

DiffProb: Data Pruning for Face Recognition

Face recognition models have made substantial progress due to advances in deep learning and the availability of large-scale datasets. However, reliance on massive annotated datasets introduces challenges related to training computational…

Computer Vision and Pattern Recognition · Computer Science 2025-05-22 Eduarda Caldeira, Jan Niklas Kolf, Naser Damer, Fadi Boutros

Towards Compact and Robust Deep Neural Networks

Deep neural networks have achieved impressive performance in many applications but their large number of parameters leads to significant computational and storage overheads. Several recent works attempt to mitigate these overheads by…

Machine Learning · Computer Science 2019-06-17 Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana