Related papers: Exploring Learning Complexity for Efficient Downst…

Lightweight Dataset Pruning without Full Training via Example Difficulty and Prediction Uncertainty

Recent advances in deep learning rely heavily on massive datasets, leading to substantial storage and training costs. Dataset pruning aims to alleviate this demand by discarding redundant examples. However, many existing methods require…

Machine Learning · Computer Science 2025-06-13 Yeseul Cho , Baekrok Shin , Changmin Kang , Chulhee Yun

Large-scale Dataset Pruning with Dynamic Uncertainty

The state of the art of many learning tasks, e.g., image classification, is advanced by collecting larger datasets and then training larger models on them. As the outcome, the increasing computational cost is becoming unaffordable. In this…

Machine Learning · Computer Science 2024-06-17 Muyang He , Shuo Yang , Tiejun Huang , Bo Zhao

Accelerating Deep Learning with Dynamic Data Pruning

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…

Machine Learning · Computer Science 2021-11-25 Ravi S Raju , Kyle Daruwalla , Mikko Lipasti

Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning

Massive data is often considered essential for deep learning applications, but it also incurs significant computational and infrastructural costs. Therefore, dataset pruning (DP) has emerged as an effective way to improve data efficiency by…

Machine Learning · Computer Science 2023-11-21 Yihua Zhang , Yimeng Zhang , Aochuan Chen , Jinghan Jia , Jiancheng Liu , Gaowen Liu , Mingyi Hong , Shiyu Chang , Sijia Liu

Label-Efficient Dataset Pruning via Semi-Supervised Pseudo-Labeling

Dataset pruning reduces the storage and training costs of deep learning by selecting an informative subset from a large dataset. However, most existing pruning methods require fully labeled data, which limits their applicability in…

Machine Learning · Computer Science 2026-05-25 Yeseul Cho , Baekrok Shin , Changmin Kang , Chulhee Yun

Learnable Sparsity for Vision Generative Models

Diffusion models have achieved impressive advancements in various vision tasks. However, these gains often rely on increasing model size, which escalates computational complexity and memory demands, complicating deployment, raising…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Yang Zhang , Er Jin , Wenzhong Liang , Yanfei Dong , Ashkan Khakzar , Philip Torr , Johannes Stegmaier , Kenji Kawaguchi

When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale

Large volumes of text data have contributed significantly to the development of large language models (LLMs) in recent years. This data is typically acquired by scraping the internet, leading to pretraining datasets comprised of noisy web…

Computation and Language · Computer Science 2023-09-12 Max Marion , Ahmet Üstün , Luiza Pozzobon , Alex Wang , Marzieh Fadaee , Sara Hooker

Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning

The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the…

Machine Learning · Computer Science 2021-03-15 Seul-Ki Yeom , Philipp Seegerer , Sebastian Lapuschkin , Alexander Binder , Simon Wiedemann , Klaus-Robert Müller , Wojciech Samek

Systematic Characterization of Minimal Deep Learning Architectures: A Unified Analysis of Convergence, Pruning, and Quantization

Deep learning networks excel at classification, yet identifying minimal architectures that reliably solve a task remains challenging. We present a computational methodology for systematically exploring and analyzing the relationships among…

Machine Learning · Computer Science 2026-01-27 Ziwei Zheng , Huizhi Liang , Vaclav Snasel , Vito Latora , Panos Pardalos , Giuseppe Nicosia , Varun Ojha

Dataset Pruning: Reducing Training Data by Examining Generalization Influence

The great success of deep learning heavily relies on increasingly larger training data, which comes at a price of huge computational and infrastructural costs. This poses crucial questions that, do all training data contribute to model's…

Machine Learning · Computer Science 2023-02-28 Shuo Yang , Zeke Xie , Hanyu Peng , Min Xu , Mingming Sun , Ping Li

D2 Pruning: Message Passing for Balancing Diversity and Difficulty in Data Pruning

Analytical theories suggest that higher-quality data can lead to lower test errors in models trained on a fixed data budget. Moreover, a model can be trained on a lower compute budget without compromising performance if a dataset can be…

Machine Learning · Computer Science 2023-10-13 Adyasha Maharana , Prateek Yadav , Mohit Bansal

Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models

Despite the remarkable generation capabilities of Diffusion Models (DMs), conducting training and inference remains computationally expensive. Previous works have been devoted to accelerating diffusion sampling, but achieving data-efficient…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Yize Li , Yihua Zhang , Sijia Liu , Xue Lin

Teacher-Guided One-Shot Pruning via Context-Aware Knowledge Distillation

Unstructured pruning remains a powerful strategy for compressing deep neural networks, yet it often demands iterative train-prune-retrain cycles, resulting in significant computational overhead. To address this challenge, we introduce a…

Computer Vision and Pattern Recognition · Computer Science 2025-11-21 Md. Samiul Alim , Sharjil Khan , Amrijit Biswas , Fuad Rahman , Shafin Rahman , Nabeel Mohammed

Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems

Deep learning recommendation systems at scale have provided remarkable gains through increasing model capacity (i.e. wider and deeper neural networks), but it comes at significant training cost and infrastructure cost. Model pruning is an…

Information Retrieval · Computer Science 2021-05-05 Xiaocong Du , Bhargav Bhushanam , Jiecao Yu , Dhruv Choudhary , Tianxiang Gao , Sherman Wong , Louis Feng , Jongsoo Park , Yu Cao , Arun Kejariwal

CLIP: Train Faster with Less Data

Deep learning models require an enormous amount of data for training. However, recently there is a shift in machine learning from model-centric to data-centric approaches. In data-centric approaches, the focus is to refine and improve the…

Computer Vision and Pattern Recognition · Computer Science 2023-08-08 Muhammad Asif Khan , Ridha Hamila , Hamid Menouar

A Study in Dataset Pruning for Image Super-Resolution

In image Super-Resolution (SR), relying on large datasets for training is a double-edged sword. While offering rich training material, they also demand substantial computational and storage resources. In this work, we analyze dataset…

Image and Video Processing · Electrical Eng. & Systems 2024-06-11 Brian B. Moser , Federico Raue , Andreas Dengel

Effective pruning of web-scale datasets based on complexity of concept clusters

Utilizing massive web-scale datasets has led to unprecedented performance gains in machine learning models, but also imposes outlandish compute requirements for their training. In order to improve training and data efficiency, we here push…

Computer Vision and Pattern Recognition · Computer Science 2024-03-13 Amro Abbas , Evgenia Rusak , Kushal Tirumala , Wieland Brendel , Kamalika Chaudhuri , Ari S. Morcos

Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning

Modern deep models are trained on large real-world datasets, where data quality varies and redundancy is common. Data-centric approaches such as dataset pruning have shown promise in improving training efficiency and model performance.…

Machine Learning · Computer Science 2025-07-18 Suorong Yang , Peijia Li , Yujie Liu , Zhiming Xu , Peng Ye , Wanli Ouyang , Furao Shen , Dongzhan Zhou

ClassPruning: Speed Up Image Restoration Networks by Dynamic N:M Pruning

Image restoration tasks have achieved tremendous performance improvements with the rapid advancement of deep neural networks. However, most prevalent deep learning models perform inference statically, ignoring that different images have…

Computer Vision and Pattern Recognition · Computer Science 2022-11-11 Yang Zhou , Yuda Song , Hui Qian , Xin Du

Class-Proportional Coreset Selection for Difficulty-Separable Data

High-quality training data is essential for building reliable and efficient machine learning systems. One-shot coreset selection addresses this by pruning the dataset while maintaining or even improving model performance, often relying on…

Machine Learning · Computer Science 2025-08-15 Elisa Tsai , Haizhong Zheng , Atul Prakash