Related papers: Subset Sampling and Its Extensions

200 papers

We study the fundamental problem of sampling independent events, called subset sampling. Specifically, consider a set of $n$ events $S=\{x_1, \ldots, x_n\}$, where each event $x_i$ has an associated probability $p(x_i)$. The subset sampling…

Data Structures and Algorithms · Computer Science 2023-09-22 Lu Yi, Hanzhi Wang, Zhewei Wei

We study the selection problem, namely that of computing the $i$th order statistic of $n$ given elements. Here we offer a data structure called selectable sloppy heap handling a dynamic version in which upon request: (i) a new…

Data Structures and Algorithms · Computer Science 2017-08-11 Adrian Dumitrescu

This paper addresses the Poisson $\pi$ps sampling problem, a topic of significant academic interest in various domains and with practical data mining applications, such as influence maximization. The problem includes a set $\mathcal{S}$ of…

Databases · Computer Science 2024-12-30 Jinchao Huang, Sibo Wang

Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down…

Machine Learning · Computer Science 2024-09-25 Guy Blanc

Running machine learning algorithms on large and rapidly growing volumes of data is often computationally expensive. One common trick to reduce the size of a data set, and thus reduce the computational cost of machine learning algorithms,…

Machine Learning · Computer Science 2022-01-25 Shaojie Tang, Jing Yuan

The note studies the problem of selecting a good enough subset out of a finite number of alternatives under a fixed simulation budget. Our work aims to maximize the posterior probability of correctly selecting a good subset. We formulate…

Optimization and Control · Mathematics 2023-05-09 Gongbo Zhang, Bin Chen, Qing-shan Jia, Yijie Peng

The subset sum problem (SSP) can be briefly stated as: given a target integer $E$ and a set $A$ containing $n$ positive integers $a_j$, find a subset of $A$ summing to $E$. The density $d$ of an SSP instance is defined by the ratio…

Data Structures and Algorithms · Computer Science 2008-06-23 Changlin Wan, Zhongzhi Shi

The Subset Sum Problem is a fundamental NP-complete problem in cryptography and combinatorial optimization, with many real-world applications. The Random Subset Sum Problem (RSSP) is a more applicable version of subset sum, where numbers…

Data Structures and Algorithms · Computer Science 2026-05-21 Edwin Chen, Christof Teuscher

In this paper, we study the Dynamic Parameterized Subset Sampling (DPSS) problem in the Word RAM model. In DPSS, the input is a set, $S$, of $n$ items, where each item, $x$, has a non-negative integer weight, $w(x)$. Given a pair of query…

Data Structures and Algorithms · Computer Science 2024-09-27 Junhao Gan, Seeun William Umboh, Hanzhi Wang, Anthony Wirth, Zhuo Zhang

We consider the problem of sampling $n$ numbers from the range $\{1,\ldots,N\}$ without replacement on modern architectures. The main result is a simple divide-and-conquer scheme that makes sequential algorithms more cache efficient and…

Data Structures and Algorithms · Computer Science 2019-11-18 Peter Sanders, Sebastian Lamm, Lorenz Hübschle-Schneider, Emanuel Schrade, Carsten Dachsbacher

We study a ranking and selection (R&S) problem when all solutions share common parametric Bayesian input models updated with the data collected from multiple independent data-generating sources. Our objective is to identify the best system…

Methodology · Statistics 2025-02-25 Eunhye Song, Taeho Kim

Suppose we have a memory storing $0$s and $1$s and we want to estimate the frequency of $1$s by sampling. We want to do this I/O-efficiently, exploiting that each read gives a block of $B$ bits at unit cost; not just one bit. If the input…

Data Structures and Algorithms · Computer Science 2024-10-21 Shyam Narayanan, Václav Rozhoň, Jakub Tětek, Mikkel Thorup

In the range $\alpha$-majority query problem, we are given a sequence $S[1..n]$ and a fixed threshold $\alpha \in (0, 1)$, and are asked to preprocess $S$ such that, given a query range $[i..j]$, we can efficiently report the symbols that…

Data Structures and Algorithms · Computer Science 2018-05-24 Travis Gagie, Meng He, Gonzalo Navarro

Sampling is a fundamental problem in computer science and statistics. However, for a given task and stream, it is often not possible to choose good sampling probabilities in advance. We derive a general framework for adaptively changing the…

Machine Learning · Statistics 2022-06-16 Daniel Ting

In recent years, the problem of computing the frequencies of the induced $k$-vertex subgraphs of a graph, or $k$-graphlets, has become central. One approach for this problem is to sample $k$-graphlets randomly. Classic algorithms for…

Data Structures and Algorithms · Computer Science 2026-04-29 Marco Bressan, T-H. Hubert Chan, Qipeng Kuang, Mauro Sozio

Sample selection improves the efficiency and effectiveness of machine learning models by providing informative and representative samples. Typically, samples can be modeled as a sample graph, where nodes are samples and edges represent…

Machine Learning · Computer Science 2025-03-04 Tianchi Xie, Jiangning Zhu, Guozu Ma, Minzhi Lin, Wei Chen, Weikai Yang, Shixia Liu

Consistent sampling is a technique for specifying, in small space, a subset $S$ of a potentially large universe $U$ such that the elements in $S$ satisfy a suitably chosen sampling condition. Given a subset $\mathcal{I}\subseteq U$ it…

Data Structures and Algorithms · Computer Science 2014-04-21 Konstantin Kutzkov, Rasmus Pagh

We revisit the optimization from samples (OPS) model, which studies the problem of optimizing objective functions directly from the sample data. Previous results showed that we cannot obtain a constant approximation ratio for the maximum…

Machine Learning · Computer Science 2020-07-07 Wei Chen, Xiaoming Sun, Jialin Zhang, Zhijie Zhang

We consider the problem of storing a dynamic string $S$ over an alphabet $\Sigma=\{\,1,\ldots,\sigma\,\}$ in compressed form. Our representation supports insertions and deletions of symbols and answers three fundamental queries:…

Data Structures and Algorithms · Computer Science 2015-07-27 J. Ian Munro, Yakov Nekrich

This paper presents a novel algorithm solving the classic problem of generating a random sample of size $s$ from a population of size $n$ with non-uniform probabilities. The sampling is done with replacement. The algorithm requires constant…

Data Structures and Algorithms · Computer Science 2016-11-03 Michał Startek