Related papers: Estimation from Partially Sampled Distributed Trac…

200 papers

Random sampling is an essential tool in the processing and transmission of data. It is used to summarize data too large to store or manipulate and meet resource constraints on bandwidth or battery power. Estimators that are applied to the…

Databases · Computer Science 2015-03-19 Edith Cohen, Haim Kaplan

Since only a small number of traces generated from distributed tracing helps in troubleshooting, its storage requirement can be significantly reduced by biasing the selection towards anomalous traces. To aid in this scenario, we propose…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-11 Alim Ul Gias, Yicheng Gao, Matthew Sheldon, José A. Perusquía, Owen O'Brien, Giuliano Casale

The performance of a machine learning system is usually evaluated by using i.i.d. observations with true labels. However, acquiring ground truth labels is expensive, while obtaining unlabeled samples may be cheaper. Stratified sampling can…

Machine Learning · Computer Science 2019-07-29 Tiancheng Yu, Xiyu Zhai, Suvrit Sra

The problem of estimating the proportion of units with a given attribute in a finite population is considered. A sample is drawn from the population by simple random sampling without replacement. There are limited funds for…

Statistics Theory · Mathematics 2019-03-26 Dominik Sieradzki, Wojciech Zieliński

Distance queries are a basic tool in data analysis. They are used for detection and localization of change for the purpose of anomaly detection, monitoring, or planning. Distance queries are particularly useful when data sets such as…

Data Structures and Algorithms · Computer Science 2015-03-20 Edith Cohen

A new approach of obtaining stratified random samples from statistically dependent random variables is described. The proposed method can be used to obtain samples from the input space of a computer forward model in estimating expectations…

Methodology · Statistics 2019-11-25 Anirban Mondal, Abhijit Mandal

We develop a new sampling method to estimate eigenvector centrality on incomplete networks. Our goal is to estimate this global centrality measure having at disposal a limited amount of data. This is the case in many real-world scenarios…

Social and Information Networks · Computer Science 2020-10-29 Nicolò Ruggeri, Caterina De Bacco

Sampling is an important tool for estimating large, complex sums and integrals over high dimensional spaces. For instance, importance sampling has been used as an alternative to exact methods for inference in belief networks. Ideally, we…

Artificial Intelligence · Computer Science 2013-01-18 Luis E. Ortiz, Leslie Pack Kaelbling

Measurement samples are often taken in various monitoring applications. To reduce the sensing cost, it is desirable to achieve better sensing quality while using fewer samples. Compressive Sensing (CS) technique finds its role when the…

Information Theory · Computer Science 2016-11-18 Ying Li, Kun Xie, Xin Wang

Sparse methods are the standard approach to obtain interpretable models with high prediction accuracy. Alternatively, algorithmic ensemble methods can achieve higher prediction accuracy at the cost of loss of interpretability. However, the…

Methodology · Statistics 2022-01-11 Anthony Christidis, Stefan Van Aelst, Ruben Zamar

In this paper, we propose a stratified sampling algorithm in which the random drawings made in the strata to compute the expectation of interest are also used to adaptively modify the proportion of further drawings in each stratum. These…

Methodology · Statistics 2007-12-04 Pierre Etore, Benjamin Jourdain

Data splitting divides data into two parts. One part is reserved for model selection. In some applications, the second part is used for model validation but we use this part for estimating the parameters of the chosen model. We focus on the…

Methodology · Statistics 2016-01-20 Julian J. Faraway

Statistical inference is often simplified by sample-splitting. This simplification comes at the cost of the introduction of randomness not native to the data. We propose a simple procedure for sequentially aggregating statistics constructed…

Econometrics · Economics 2024-11-18 David M. Ritzwoller, Joseph P. Romano

Statistical samples, in order to be representative, have to be drawn from a population in a random and unbiased way. Nevertheless, it is common practice in the field of model-based diagnosis to make estimations from (biased) best-first…

Artificial Intelligence · Computer Science 2022-08-05 Patrick Rodler, Fatima Elichanova

Partially observed data collected by sampling methods is often studied to obtain the characteristics of information diffusion networks. However, these methods usually do not consider the behavior of the diffusion process. In this paper,…

Social and Information Networks · Computer Science 2014-05-30 Motahareh Eslami Mehdiabadi, Hamid R. Rabiee, Mostafa Salehi

As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine…

Machine Learning · Statistics 2019-12-10 Biyi Fang, Diego Klabjan

In recent years, an increasing amount of data has been collected in different, and often non-cooperative, databases. The problem of privacy-preserving, distributed calculations over separated databases and the related issue of private data…

Databases · Computer Science 2016-05-23 Philip Derbeko, Shlomi Dolev, Ehud Gudes, Jeffrey D. Ullman

The problem of covariance estimation for replicated surface-valued processes is examined from the functional data analysis perspective. Considerations of statistical and computational efficiency often compel the use of separability of the…

Methodology · Statistics 2021-10-25 Tomas Masak, Victor M. Panaretos

Network sampling is a crucial technique for analyzing large or partially observable networks. However, the effectiveness of different sampling methods can vary significantly depending on the context. In this study, we empirically compare…

Social and Information Networks · Computer Science 2025-05-05 Quoc Chuong Nguyen

Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under…

Data Analysis, Statistics and Probability · Physics 2007-05-23 Radford M. Neal