Related papers: A Bayesian algorithm for sample selection bias cor…

Sampling techniques for big data analysis in finite population inference

In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first…

Methodology · Statistics 2019-01-08 Jae Kwang Kim, Zhonglei Wang

Robust Bayesian Inference for Big Data: Combining Sensor-based Records with Traditional Survey Data

Big Data often presents as massive non-probability samples. Not only is the selection mechanism often unknown, but larger data volume amplifies the relative contribution of selection bias to total error. Existing bias adjustment approaches…

Methodology · Statistics 2022-03-29 Ali Rafei, Carol A. C. Flannagan, Brady T. West, Michael R. Elliott

Social Media Data for Population Mapping: A Bayesian Approach to Address Representativeness and Privacy Challenges

Accurate and timely population data are essential for disaster response and humanitarian planning, but traditional censuses often cannot capture rapid demographic changes. Social media data offer a promising alternative for dynamic…

Applications · Statistics 2026-01-30 Paolo Andrich, Shengjie Lai, Halim Jun, Qianwen Duan, Zhifeng Cheng, Seth R. Flaxman, Andrew J. Tatem

Integrating Big Data and Survey Data for Efficient Estimation of the Median

An ever-increasing deluge of big data is becoming available to national statistical offices globally, but it is well documented that statistics produced by big data alone often suffer from selection bias and are not usually representative…

Methodology · Statistics 2023-06-29 Ryan Covey

Integrating representative and non-representative survey data for efficient inference

Non-representative surveys are commonly used and widely available but suffer from selection bias that generally cannot be entirely eliminated using weighting techniques. Instead, we propose a Bayesian method to synthesize longitudinal…

Methodology · Statistics 2024-07-08 Nathaniel Dyrkton, Paul Gustafson, Harlan Campbell

Bayesian Approach to Handling Informative Sampling

In the case of informative sampling the sampling scheme explicitly or implicitly depends on the response variable. As a result, the sample distribution of response variable cannot be used for making inference about the population. In this…

Applications · Statistics 2016-11-18 Anna Sikov

Making Online Polls More Accurate: Statistical Methods Explained

Online data has the potential to transform how researchers and companies produce election forecasts. Social media surveys, online panels and even comments scraped from the internet can offer valuable insights into political preferences.…

Applications · Statistics 2025-03-20 Alberto Arletti, Maria Letizia Tanturri, Omar Paccagnella

Inference from Non-Random Samples Using Bayesian Machine Learning

We consider inference from non-random samples in data-rich settings where high-dimensional auxiliary information is available both in the sample and the target population, with survey inference being a special case. We propose a regularized…

Methodology · Statistics 2021-04-13 Yutao Liu, Andrew Gelman, Qixuan Chen

Estimating Propensities of Selection for Big Datasets via Data Integration

Big data presents potential but unresolved value as a source for analysis and inference. However, selection bias, present in many of these datasets, needs to be accounted for so that appropriate inferences can be made on the target…

Methodology · Statistics 2025-01-09 Lyndon Ang, Robert Clark, Bronwyn Loong, Anders Holmberg

Non-Bayesian Social Learning with Multiview Observations

Non-Bayesian social learning enables multiple agents to conduct networked signal and information processing through observing environmental signals and information aggregating. Traditional non-Bayesian social learning models only consider…

Social and Information Networks · Computer Science 2024-07-31 Dongyan Sui, Weichen Cao, Stefan Vlaski, Chun Guan, Siyang Leng

A Bayesian Method for Causal Modeling and Discovery Under Selection

This paper describes a Bayesian method for learning causal networks using samples that were selected in a non-random manner from a population of interest. Examples of data obtained by non-random sampling include convenience samples and…

Artificial Intelligence · Computer Science 2013-01-18 Gregory F. Cooper

Fully Bayesian Estimation Under Informative Sampling

Bayesian estimation is increasingly popular for performing model based inference to support policymaking. These data are often collected from surveys under informative sampling designs where subject inclusion probabilities are designed to…

Methodology · Statistics 2018-07-13 Luis G. Leon-Novelo, Terrance D. Savitsky

Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of…

Computers and Society · Computer Science 2019-05-16 Zijian Wang, Scott A. Hale, David Adelani, Przemyslaw A. Grabowicz, Timo Hartmann, Fabian Flöck, David Jurgens

On two existing approaches to statistical analysis of social media data

Using social media data for statistical analysis of general population faces commonly two basic obstacles: firstly, social media data are collected for different objects than the population units of interest; secondly, the relevant measures…

Applications · Statistics 2019-05-03 Martina Patone, Li-Chun Zhang

Nonparametric Bayes modeling with sample survey weights

In population studies, it is standard to sample data via designs in which the population is divided into strata, with the different strata assigned different probabilities of inclusion. Although there have been some proposals for including…

Methodology · Statistics 2014-09-29 T. Kunihama, A. H. Herring, C. T. Halpern, D. B. Dunson

Distributed Computation for Marginal Likelihood based Model Choice

We propose a general method for distributed Bayesian model choice, using the marginal likelihood, where a data set is split in non-overlapping subsets. These subsets are only accessed locally by individual workers and no data is shared…

Computation · Statistics 2022-10-18 Alexander Buchholz, Daniel Ahfock, Sylvia Richardson

A Bayesian Approach for Accurate Classification-Based Aggregates

In this paper, we study the accuracy of values aggregated over classes predicted by a classification algorithm. The problem is that the resulting aggregates (e.g., sums of a variable) are known to be biased. The bias can be large even for…

Machine Learning · Statistics 2019-12-02 Q. A. Meertens, C. G. H. Diks, H. J. van den Herik, F W Takes

Probability and Non-Probability Samples: Improving Regression Modeling by Using Data from Different Sources

Non-probability sampling, for example in the form of online panels, has become a fast and cheap method to collect data. While reliable inference tools are available for classical probability samples, non-probability samples can yield…

Methodology · Statistics 2022-04-05 Gerhard Tutz

Measures of classification bias derived from sample size analysis

We propose the use of a simple intuitive principle for measuring algorithmic classification bias: the significance of the differences in a classifier's error rates across the various demographics is inversely commensurate with the sample…

Methodology · Statistics 2026-01-08 Ioannis Ivrissimtzis, Shauna Concannon, Matthew Houliston, Graham Roberts

Accounting for selection bias using simulations: A general method and an application to millimeter-wavelength surveys

We have developed a new Bayesian method to correct the flux densities of astronomical sources. The hybrid method combines a simulated likelihood to model survey selection together with an analytic source-count-based prior. The simulated…

Astrophysics of Galaxies · Physics 2020-04-29 Megan B. Gralla, Tobias A. Marriage