Related papers: R package SamplingStrata: new developments and ext…
The R package optimall offers a collection of functions that efficiently streamline the design process of sampling in surveys ranging from simple to complex. The package's main functions allow users to interactively define and adjust strata…
Predicting the cheapest sample size for the optimal stratification in multivariate survey design is a problem in cases where the population frame is large. A solution exists that iteratively searches for the minimum sample size necessary to…
Recent works have proposed optimal subsampling algorithms to improve computational efficiency in large datasets and to design validation studies in the presence of measurement error. Existing approaches generally fall into two categories:…
The optimum sample allocation in stratified sampling is one of the basic issues of survey methodology. It is a procedure of dividing the overall sample size into strata sample sizes in such a way that for given sampling designs in strata…
This paper investigates the use of stratified sampling as a variance reduction technique for approximating integrals over large dimensional spaces. The accuracy of this method critically depends on the choice of the space partition, the…
Online controlled experiments, also known as A/B testing, are the digital equivalent of randomized controlled trials for estimating the impact of marketing campaigns on website visitors. Stratified sampling is a traditional technique for…
We derive optimality conditions for the optimum sample allocation problem in stratified sampling, formulated as the determination of the fixed strata sample sizes that minimize the total cost of the survey, under the assumed level of…
The task of RNA design given a target structure aims to find a sequence that can fold into that structure. It is a computationally hard problem where some version(s) have been proven to be NP-hard. As a result, heuristic methods such as…
In sampling theory, stratification corresponds to a technique used in surveys, which allows segmenting a population into homogeneous subpopulations (strata) to produce statistics with a higher level of precision. In particular, this article…
Stratified random sampling (SRS) is a fundamental sampling technique that provides accurate estimates for aggregate queries using a small size sample, and has been used widely for approximate query processing. A key question in SRS is how…
In this paper, we propose a stratified sampling algorithm in which the random drawings made in the strata to compute the expectation of interest are also used to adaptively modify the proportion of further drawings in each stratum. These…
Sampling-based planning algorithm is a powerful tool for solving planning problems in high-dimensional state spaces. In this article, we present a novel approach to sampling in the most promising regions, which significantly reduces…
In classification problems, sampling bias between training data and testing data is critical to the ranking performance of classification scores. Such bias can be both unintentionally introduced by data collection and intentionally…
In many randomized trials, outcomes such as essays or open-ended responses must be manually scored as a preliminary step to impact analysis, a process that is costly and limiting. Model-assisted estimation offers a way to combine surrogate…
Geographical data are generally autocorrelated. In this case, it is preferable to select spread units. In this paper, we propose a new method for selecting well-spread samples from a finite spatial population with equal or unequal inclusion…
The performance of a machine learning system is usually evaluated by using i.i.d.\ observations with true labels. However, acquiring ground truth labels is expensive, while obtaining unlabeled samples may be cheaper. Stratified sampling can…
Subsampling is a widely used and effective approach for addressing the computational challenges posed by massive datasets. Substantial progress has been made in developing non-uniform, probability-based subsampling schemes that prioritize…
Optimal subset selection is an important task that has numerous algorithms designed for it and has many application areas. STPGA contains a special genetic algorithm supplemented with a tabu memory property (that keeps track of previously…
A new approach of obtaining stratified random samples from statistically dependent random variables is described. The proposed method can be used to obtain samples from the input space of a computer forward model in estimating expectations…
We report a new optimal resolution for the statistical stratification problem under proportional sampling allocation among strata. Consider a finite population of N units, a random sample of n units selected from this population and a…