Related papers: Optimal Sampling Gaps for Adaptive Submodular Maxi…

200 papers

A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. Unfortunately, the resulting submodular optimization…

Machine Learning · Computer Science 2015-04-23 Rafael da Ponte Barbosa, Alina Ene, Huy L. Nguyen, Justin Ward

Big data is ubiquitous in practice, and it has also led to a heavy computational burden. To reduce the computational cost and ensure the effectiveness of parameter estimators, an optimal subset sampling method is proposed to estimate the…

Methodology · Statistics 2023-11-16 Haohui Han, Liya Fu

Sampling is a fundamental problem in computer science and statistics. However, for a given task and stream, it is often not possible to choose good sampling probabilities in advance. We derive a general framework for adaptively changing the…

Machine Learning · Statistics 2022-06-16 Daniel Ting

The era of huge data necessitates highly efficient machine learning algorithms. Many common machine learning algorithms, however, rely on computationally intensive subroutines that are prohibitively expensive on large datasets. Oftentimes,…

Machine Learning · Computer Science 2023-09-26 Mo Tiwari

This paper addresses the problem of sequential submodular maximization: selecting and ranking items in a sequence to optimize some composite submodular function. In contrast to most of the previous works, which assume access to the utility…

Machine Learning · Computer Science 2024-09-10 Jing Yuan, Shaojie Tang

We propose subsampling as a unified algorithmic technique for submodular maximization in centralized and online settings. The idea is simple: independently sample elements from the ground set, and use simple combinatorial techniques (such…

Data Structures and Algorithms · Computer Science 2021-04-08 Christopher Harshaw, Ehsan Kazemi, Moran Feldman, Amin Karbasi

Subsampling algorithms for various parametric regression models with massive data have been extensively investigated in recent years. However, all existing studies on subsampling heavily rely on clean massive data. In practical…

Statistics Theory · Mathematics 2025-06-11 Jiangshan Ju, Mingqiu Wang, Shengli Zhao

Almost every software system provides configuration options to tailor the system to the target platform and application scenario. Often, this configurability renders the analysis of every individual system configuration infeasible. To…

Software Engineering · Computer Science 2016-02-17 Flávio Medeiros, Christian Kästner, Márcio Ribeiro, Rohit Gheyi, Sven Apel

With appropriately chosen sampling probabilities, sampling-based random projection can be used to implement large-scale statistical methods, substantially reducing computational cost while maintaining low statistical error. However,…

Machine Learning · Statistics 2026-01-13 Yifan Chen, Yun Yang

For massive data, the family of subsampling algorithms is popular to downsize the data volume and reduce computational burden. Existing studies focus on approximating the ordinary least squares estimate in linear regression, where…

Computation · Statistics 2019-06-27 HaiYing Wang, Rong Zhu, Ping Ma

We compute the integral of a function or the expectation of a random variable with minimal cost and use, for our new algorithm and for upper bounds of the complexity, i.i.d. samples. Under certain assumptions it is possible to select a…

Numerical Analysis · Mathematics 2018-10-24 Robert J. Kunsch, Erich Novak, Daniel Rudolf

Efficient sampling in biomolecular simulations is critical for accurately capturing the complex dynamical behaviors of biological systems. Adaptive sampling techniques aim to improve efficiency by focusing computational resources on the…

Biomolecules · Quantitative Biology 2024-10-22 Hassan Nadeem, Diwakar Shukla

The goal of a sequential decision-making problem is to design an interactive policy that adaptively selects a group of items, with each selection based on feedback from the past, in order to maximize the expected utility of selected…

Data Structures and Algorithms · Computer Science 2022-09-13 Shaojie Tang

This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator). We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space…

Machine Learning · Computer Science 2025-03-18 Gen Li, Yuting Wei, Yuejie Chi, Yuxin Chen

Several large-scale machine learning tasks, such as data summarization, can be approached by maximizing functions that satisfy submodularity. These optimization problems often involve complex side constraints, imposed by the underlying…

Data Structures and Algorithms · Computer Science 2021-02-15 Francesco Quinzan, Vanja Doskoč, Andreas Göbel, Tobias Friedrich

Data augmentation is commonly used to encode invariances in learning methods. However, this process is often performed in an inefficient manner, as artificial examples are created by applying a number of transformations to all points in the…

Machine Learning · Computer Science 2019-03-04 Michael Kuchnik, Virginia Smith

A significant hurdle for analyzing large sample data is the lack of effective statistical computing and inference methods. An emerging powerful approach for analyzing large sample data is subsampling, by which one takes a random subsample…

Methodology · Statistics 2015-11-24 Rong Zhu, Ping Ma, Michael W. Mahoney, Bin Yu

Adaptive sampling algorithms are modern and efficient methods that dynamically adjust the sample size throughout the optimization process. However, they may encounter difficulties in risk-averse settings, particularly due to the challenge…

Optimization and Control · Mathematics 2025-02-17 Sandra Pieraccini, Tommaso Vanzan

A variety of large-scale machine learning problems can be cast as instances of constrained submodular maximization. Existing approaches for distributed submodular maximization have a critical drawback: The capacity - number of instances…

Machine Learning · Statistics 2016-06-01 Mario Lucic, Olivier Bachem, Morteza Zadimoghaddam, Andreas Krause

Measurement-constrained datasets, often encountered in semi-supervised learning, arise when data labeling is costly, time-intensive, or hindered by confidentiality or ethical concerns, resulting in a scarcity of labeled data. In certain…

Methodology · Statistics 2025-01-15 Yixin Shen, Yang Ning