Related papers: Slow Kill for Big Data Learning

A Stochastic Penalty Model for Convex and Nonconvex Optimization with Big Constraints

The last decade witnessed a rise in the importance of supervised learning applications involving {\em big data} and {\em big models}. Big data refers to situations where the amounts of training data available and needed causes difficulties…

Optimization and Control · Mathematics 2018-11-01 Konstantin Mishchenko , Peter Richtárik

Convex Optimization for Big Data

This article reviews recent advances in convex optimization algorithms for Big Data, which aim to reduce the computational, storage, and communications bottlenecks. We provide an overview of this emerging field, describe contemporary…

Optimization and Control · Mathematics 2014-11-05 Volkan Cevher , Stephen Becker , Mark Schmidt

Starting Small -- Learning with Adaptive Sample Sizes

For many machine learning problems, data is abundant and it may be prohibitive to make multiple passes through the full training set. In this context, we investigate strategies for dynamically increasing the effective sample size, when…

Machine Learning · Computer Science 2016-10-10 Hadi Daneshmand , Aurelien Lucchi , Thomas Hofmann

Feature Selection with Annealing for Computer Vision and Big Data Learning

Many computer vision and medical imaging problems are faced with learning from large-scale datasets, with millions of observations and features. In this paper we propose a novel efficient learning scheme that tightens a sparsity constraint…

Machine Learning · Statistics 2017-02-07 Adrian Barbu , Yiyuan She , Liangjing Ding , Gary Gramajo

Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases

This work investigates the ``small-vs-large gap'', where repeating on fewer samples can lead to compute saving during training compared to using a larger dataset. This is observed across algorithmic tasks, architectures and optimizers and…

Machine Learning · Computer Science 2026-05-21 Jingwen Liu , Ezra Edelman , Surbhi Goel , Bingbin Liu

Compressing Large Sample Data for Discriminant Analysis

Large-sample data became prevalent as data acquisition became cheaper and easier. While a large sample size has theoretical advantages for many statistical methods, it presents computational challenges. Sketching, or compression, is a…

Machine Learning · Statistics 2020-05-11 Alexander F. Lapanowski , Irina Gaynanova

DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size

Large-scale supervised classification algorithms, especially those based on deep convolutional neural networks (DCNNs), require vast amounts of training data to achieve state-of-the-art performance. Decreasing this data requirement would…

Computer Vision and Pattern Recognition · Computer Science 2016-06-15 Maya Kabkab , Azadeh Alavi , Rama Chellappa

Subdata selection for big data regression: an improved approach

In the big data era researchers face a series of problems. Even standard approaches/methodologies, like linear regression, can be difficult or problematic with huge volumes of data. Traditional approaches for regression in big datasets may…

Methodology · Statistics 2024-11-13 Vasilis Chasiotis , Dimitris Karlis

Sketching Datasets for Large-Scale Learning (long version)

This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is…

Machine Learning · Statistics 2021-06-28 Rémi Gribonval , Antoine Chatalic , Nicolas Keriven , Vincent Schellekens , Laurent Jacques , Philip Schniter

Efficient Large-Scale Learning of Minimax Risk Classifiers

Supervised learning with large-scale data usually leads to complex optimization problems, especially for classification tasks with multiple classes. Stochastic subgradient methods can enable efficient learning with a large number of samples…

Machine Learning · Computer Science 2025-11-25 Kartheek Bondugula , Santiago Mazuelas , Aritz Pérez

Accelerating Deep Learning with Dynamic Data Pruning

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…

Machine Learning · Computer Science 2021-11-25 Ravi S Raju , Kyle Daruwalla , Mikko Lipasti

Big Data Analysis Using Shrinkage Strategies

In this paper, we apply shrinkage strategies to estimate regression coefficients efficiently for the high-dimensional multiple regression model, where the number of samples is smaller than the number of predictors. We assume in the sparse…

Methodology · Statistics 2017-04-19 B. Yuzbasi , M. Arashi , S. E. Ahmed

Review Non-convex Optimization Method for Machine Learning

Non-convex optimization is a critical tool in advancing machine learning, especially for complex models like deep neural networks and support vector machines. Despite challenges such as multiple local minima and saddle points, non-convex…

Machine Learning · Computer Science 2024-10-04 Greg B Fotopoulos , Paul Popovich , Nicholas Hall Papadopoulos

Learning Options via Compression

Identifying statistical regularities in solutions to some tasks in multi-task reinforcement learning can accelerate the learning of new tasks. Skill learning offers one way of identifying these regularities by decomposing pre-collected…

Machine Learning · Computer Science 2022-12-12 Yiding Jiang , Evan Zheran Liu , Benjamin Eysenbach , Zico Kolter , Chelsea Finn

A constrained optimization approach to improve robustness of neural networks

In this paper, we present a novel nonlinear programming-based approach to fine-tune pre-trained neural networks to improve robustness against adversarial attacks while maintaining high accuracy on clean data. Our method introduces…

Machine Learning · Computer Science 2024-10-28 Shudian Zhao , Jan Kronqvist

Faster Learning by Reduction of Data Access Time

Nowadays, the major challenge in machine learning is the Big Data challenge. The big data problems due to large number of data points or large number of features in each data point, or both, the training of models have become very slow. The…

Machine Learning · Computer Science 2018-07-26 Vinod Kumar Chauhan , Anuj Sharma , Kalpana Dahiya

Big Learning with Bayesian Methods

Explosive growth in data and availability of cheap computing resources have sparked increasing interest in Big learning, an emerging subfield that studies scalable machine learning algorithms, systems, and applications with Big Data.…

Machine Learning · Computer Science 2017-03-02 Jun Zhu , Jianfei Chen , Wenbo Hu , Bo Zhang

A Survey on Large-scale Machine Learning

Machine learning can provide deep insights into data, allowing machines to make high-quality predictions and having been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…

Machine Learning · Computer Science 2020-08-11 Meng Wang , Weijie Fu , Xiangnan He , Shijie Hao , Xindong Wu

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…

Optimization and Control · Mathematics 2020-08-28 Filip Hanzely

Adaptive Sampling Strategies for Stochastic Optimization

In this paper, we propose a stochastic optimization method that adaptively controls the sample size used in the computation of gradient approximations. Unlike other variance reduction techniques that either require additional storage or the…

Optimization and Control · Mathematics 2017-11-01 Raghu Bollapragada , Richard Byrd , Jorge Nocedal