Related papers: Accelerating Machine Learning Algorithms with Adap…

Active sampling: A machine-learning-assisted framework for finite population inference with optimal subsamples

Data subsampling has become widely recognized as a tool to overcome computational and economic bottlenecks in analyzing massive datasets. We contribute to the development of adaptive design for estimation of finite population…

Methodology · Statistics 2024-07-08 Henrik Imberg, Xiaomi Yang, Carol Flannagan, Jonas Bärgman

Column Selection via Adaptive Sampling

Selecting a good column (or row) subset of massive data matrices has found many applications in data analysis and machine learning. We propose a new adaptive sampling algorithm that can be used to improve any relative-error column selection…

Data Structures and Algorithms · Computer Science 2015-10-15 Saurabh Paul, Malik Magdon-Ismail, Petros Drineas

Efficient Augmentation via Data Subsampling

Data augmentation is commonly used to encode invariances in learning methods. However, this process is often performed in an inefficient manner, as artificial examples are created by applying a number of transformations to all points in the…

Machine Learning · Computer Science 2019-03-04 Michael Kuchnik, Virginia Smith

Batch mode active learning for efficient parameter estimation

For many tasks of data analysis, we may only have the information of the explanatory variable, and the evaluation of the response values is quite expensive. While it is impractical or too costly to obtain the responses of all units, a…

Computation · Statistics 2023-04-07 Wei Zheng, Ting Tian, Xueqin Wang

Adaptive Optimization Algorithms for Machine Learning

Machine learning assumes a pivotal role in our data-driven world. The increasing scale of models and datasets necessitates quick and reliable algorithms for model training. This dissertation investigates adaptivity in machine learning…

Machine Learning · Computer Science 2023-11-20 Slavomír Hanzely

Model-specific Data Subsampling with Influence Functions

Model selection requires repeatedly evaluating models on a given dataset and measuring their relative performances. In modern applications of machine learning, the models being considered are increasingly more expensive to evaluate and the…

Machine Learning · Computer Science 2020-10-21 Anant Raj, Cameron Musco, Lester Mackey, Nicolo Fusi

Starting Small -- Learning with Adaptive Sample Sizes

For many machine learning problems, data is abundant and it may be prohibitive to make multiple passes through the full training set. In this context, we investigate strategies for dynamically increasing the effective sample size, when…

Machine Learning · Computer Science 2016-10-10 Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

Subsampling for Big Data Linear Models with Measurement Errors

Subsampling algorithms for various parametric regression models with massive data have been extensively investigated in recent years. However, all existing studies on subsampling heavily rely on clean massive data. In practical…

Statistics Theory · Mathematics 2025-06-11 Jiangshan Ju, Mingqiu Wang, Shengli Zhao

Accelerating Deep Learning with Dynamic Data Pruning

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…

Machine Learning · Computer Science 2021-11-25 Ravi S Raju, Kyle Daruwalla, Mikko Lipasti

Towards Optimizing the Expected Performance of Sampling-Based Quantum-Inspired Algorithms

Quantum-inspired classical algorithms have received much attention due to their exponential speedup compared to existing algorithms, under certain data storage assumptions. The improvements are noticeable in fundamental linear algebra tasks.…

Quantum Physics · Physics 2025-12-08 Hyunho Cha, Jungwoo Lee

Intelligent sampling for surrogate modeling, hyperparameter optimization, and data analysis

Sampling techniques are used in many fields, including design of experiments, image processing, and graphics. The techniques in each field are designed to meet the constraints specific to that field such as uniform coverage of the range of…

Machine Learning · Computer Science 2023-06-08 Chandrika Kamath

Optimal Sampling Gaps for Adaptive Submodular Maximization

Running machine learning algorithms on large and rapidly growing volumes of data is often computationally expensive. One common trick to reduce the size of a data set, and thus reduce the computational cost of machine learning algorithms,…

Machine Learning · Computer Science 2022-01-25 Shaojie Tang, Jing Yuan

Strongly Adaptive Online Learning

Strongly adaptive algorithms are algorithms whose performance on every time interval is close to optimal. We present a reduction that can transform standard low-regret algorithms to strongly adaptive. As a consequence, we derive simple, yet…

Machine Learning · Computer Science 2015-06-22 Amit Daniely, Alon Gonen, Shai Shalev-Shwartz

Optimal subsampling algorithm for the marginal model with large longitudinal data

Big data is ubiquitous in practice, and it has also led to a heavy computational burden. To reduce the calculation cost and ensure the effectiveness of parameter estimators, an optimal subset sampling method is proposed to estimate the…

Methodology · Statistics 2023-11-16 Haohui Han, Liya Fu

A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems

The exponential growth of volume, variety and velocity of data is raising the need for investigations of automated or semi-automated ways to extract useful patterns from the data. It requires deep expert knowledge and extensive…

Machine Learning · Computer Science 2020-07-22 Abbas Raza Ali, Marcin Budka, Bogdan Gabrys

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…

Optimization and Control · Mathematics 2020-08-28 Filip Hanzely

Subsampling Algorithms for Semidefinite Programming

We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration and the subsampling ratio explicitly controls…

Optimization and Control · Mathematics 2011-08-30 Alexandre d'Aspremont

Adaptive Scheduling for Multi-Task Learning

To train neural machine translation models simultaneously on multiple tasks (languages), it is common to sample each task uniformly or in proportion to dataset sizes. As these methods offer little control over performance trade-offs, we…

Machine Learning · Computer Science 2019-09-17 Sébastien Jean, Orhan Firat, Melvin Johnson

Incremental Sampling Without Replacement for Sequence Models

Sampling is a fundamental technique, and sampling without replacement is often desirable when duplicate samples are not beneficial. Within machine learning, sampling is useful for generating diverse outputs from a trained model. We present…

Machine Learning · Computer Science 2021-07-21 Kensen Shi, David Bieber, Charles Sutton

Human-like machine learning: limitations and suggestions

This paper attempts to address the issues of machine learning in its current implementation. It is known that machine learning algorithms require a significant amount of data for training purposes, whereas recent developments in deep…

Machine Learning · Computer Science 2018-11-16 Georgios Mastorakis