Related papers: Selection via Proxy: Efficient Data Selection for …

200 papers

Deep neural networks have gained great success due to the increasing amounts of data, and diverse effective neural network designs. However, it also brings a heavy computing burden as the amount of training data is proportional to the…

Machine Learning · Computer Science 2023-10-19 Peng Yao, Chao Liao, Jiyuan Jia, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang

Fine-tuning the pre-trained model with active learning holds promise for reducing annotation costs. However, this combination introduces significant computational costs, particularly with the growing scale of pre-trained models. Recent…

Machine Learning · Computer Science 2024-11-19 Ziting Wen, Oscar Pizarro, Stefan Williams

Deep learning models for medical image segmentation are primarily data-driven. Models trained with more data lead to improved performance and generalizability. However, training is a computationally expensive process because multiple…

Image and Video Processing · Electrical Eng. & Systems 2021-07-13 Vishwesh Nath, Dong Yang, Ali Hatamizadeh, Anas A. Abidin, Andriy Myronenko, Holger Roth, Daguang Xu

Deep learning (DL) based diagnostics systems can provide accurate and robust quantitative analysis in digital pathology. These algorithms require large amounts of annotated training data which is impractical in pathology due to the high…

Computer Vision and Pattern Recognition · Computer Science 2024-07-23 Tahsin Reasat, Asif Sushmit, David S. Smith

One of the biggest bottlenecks in a machine learning workflow is waiting for models to train. Depending on the available computing resources, it can take days to weeks to train a neural network on a large dataset with many classes such as…

Machine Learning · Computer Science 2019-06-13 Sam Shleifer, Eric Prokop

The great success of deep learning heavily relies on increasingly larger training data, which comes at a price of huge computational and infrastructural costs. This poses crucial questions that, do all training data contribute to model's…

Machine Learning · Computer Science 2023-02-28 Shuo Yang, Zeke Xie, Hanyu Peng, Min Xu, Mingming Sun, Ping Li

Vulnerability detection is crucial for identifying security weaknesses in software systems. However, training effective machine learning models for this task is often constrained by the high cost and expertise required for data annotation.…

Cryptography and Security · Computer Science 2025-08-19 Xiang Lan, Tim Menzies, Bowen Xu

Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow. Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples. Despite their appeal, these…

Artificial Intelligence · Computer Science 2024-10-17 Talfan Evans, Shreya Pathak, Hamza Merzic, Jonathan Schwarz, Ryutaro Tanno, Olivier J. Henaff

We study the practical consequences of dataset sampling strategies on the performance of recommendation algorithms. Recommender systems are generally trained and evaluated on samples of larger datasets. Samples are often taken in a naive or…

Information Retrieval · Computer Science 2021-07-13 Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

Coreset selection, which aims to select a subset of the most informative training samples, is a long-standing learning problem that can benefit many downstream tasks such as data-efficient learning, continual learning, neural architecture…

Machine Learning · Computer Science 2022-06-30 Chengcheng Guo, Bo Zhao, Yanbing Bai

Active learning is of great interest for many practical applications, especially in industry and the physical sciences, where there is a strong need to minimize the number of costly experiments necessary to train predictive models. However,…

Machine Learning · Computer Science 2021-12-23 Maryam Pardakhti, Nila Mandal, Anson W. K. Ma, Qian Yang

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…

Machine Learning · Computer Science 2021-11-25 Ravi S Raju, Kyle Daruwalla, Mikko Lipasti

We present an efficient coreset construction algorithm for large-scale Support Vector Machine (SVM) training in Big Data and streaming applications. A coreset is a small, representative subset of the original data points such that a models…

Machine Learning · Computer Science 2020-02-18 Murad Tukan, Cenk Baykal, Dan Feldman, Daniela Rus

Finding valuable training data points for deep neural networks has been a core research challenge with many applications. In recent years, various techniques for calculating the "value" of individual training datapoints have been proposed…

Machine Learning · Computer Science 2021-04-29 Soumi Das, Arshdeep Singh, Saptarshi Chatterjee, Suparna Bhattacharya, Sourangshu Bhattacharya

Deep Learning requires large amounts of data to train models that work well. In data-deficient settings, performance can be degraded. We investigate which Deep Learning methods benefit training models in a data-deficient setting, by…

Computer Vision and Pattern Recognition · Computer Science 2025-06-11 Robert-Jan Bruintjes, Attila Lengyel, Osman Semih Kayhan, Davide Zambrano, Nergis Tömen, Hadi Jamali-Rad, Jan van Gemert

At its core, this thesis aims to enhance the practicality of deep learning by improving the label and training efficiency of deep learning models. To this end, we investigate data subset selection techniques, specifically active learning…

Machine Learning · Computer Science 2024-03-11 Andreas Kirsch

Convolutional neural networks (CNNs) have been successfully applied to many recognition and learning tasks using a universal recipe; training a deep model on a very large dataset of supervised examples. However, this approach is rather…

Machine Learning · Statistics 2018-06-04 Ozan Sener, Silvio Savarese

Transfer learning has become an essential tool in modern computer vision, allowing practitioners to leverage backbones, pretrained on large datasets, to train successful models from limited annotated data. Choosing the right backbone is…

Computer Vision and Pattern Recognition · Computer Science 2025-08-20 Joris Guerin, Shray Bansal, Amirreza Shaban, Paulo Mann, Harshvardhan Gazula

Among various supervised deep metric learning methods proxy-based approaches have achieved high retrieval accuracies. Proxies, which are class-representative points in an embedding space, receive updates based on proxy-sample similarities…

Computer Vision and Pattern Recognition · Computer Science 2022-11-21 Aoyu Li, Ikuro Sato, Kohta Ishikawa, Rei Kawakami, Rio Yokota

Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry. Their data curation poses the challenges of expensive human labeling, inadequate computing resources and larger experiment turn around…

Computer Vision and Pattern Recognition · Computer Science 2019-01-07 Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, Ganesh Ramakrishnan