Related papers: A Bayesian algorithm for sample selection bias cor…

200 papers

In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first…

Methodology · Statistics 2019-01-08 Jae Kwang Kim, Zhonglei Wang

Big Data often presents as massive non-probability samples. Not only is the selection mechanism often unknown, but larger data volume amplifies the relative contribution of selection bias to total error. Existing bias adjustment approaches…

Methodology · Statistics 2022-03-29 Ali Rafei, Carol A. C. Flannagan, Brady T. West, Michael R. Elliott

Accurate and timely population data are essential for disaster response and humanitarian planning, but traditional censuses often cannot capture rapid demographic changes. Social media data offer a promising alternative for dynamic…

An ever-increasing deluge of big data is becoming available to national statistical offices globally, but it is well documented that statistics produced by big data alone often suffer from selection bias and are not usually representative…

Methodology · Statistics 2023-06-29 Ryan Covey

Non-representative surveys are commonly used and widely available but suffer from selection bias that generally cannot be entirely eliminated using weighting techniques. Instead, we propose a Bayesian method to synthesize longitudinal…

Methodology · Statistics 2024-07-08 Nathaniel Dyrkton, Paul Gustafson, Harlan Campbell

In the case of informative sampling the sampling scheme explicitly or implicitly depends on the response variable. As a result, the sample distribution of response variable cannot be used for making inference about the population. In this…

Applications · Statistics 2016-11-18 Anna Sikov

Online data has the potential to transform how researchers and companies produce election forecasts. Social media surveys, online panels and even comments scraped from the internet can offer valuable insights into political preferences.…

Applications · Statistics 2025-03-20 Alberto Arletti, Maria Letizia Tanturri, Omar Paccagnella

We consider inference from non-random samples in data-rich settings where high-dimensional auxiliary information is available both in the sample and the target population, with survey inference being a special case. We propose a regularized…

Methodology · Statistics 2021-04-13 Yutao Liu, Andrew Gelman, Qixuan Chen

Big data presents potential but unresolved value as a source for analysis and inference. However, selection bias, present in many of these datasets, needs to be accounted for so that appropriate inferences can be made on the target…

Methodology · Statistics 2025-01-09 Lyndon Ang, Robert Clark, Bronwyn Loong, Anders Holmberg

Non-Bayesian social learning enables multiple agents to conduct networked signal and information processing through observing environmental signals and information aggregating. Traditional non-Bayesian social learning models only consider…

Social and Information Networks · Computer Science 2024-07-31 Dongyan Sui, Weichen Cao, Stefan Vlaski, Chun Guan, Siyang Leng

This paper describes a Bayesian method for learning causal networks using samples that were selected in a non-random manner from a population of interest. Examples of data obtained by non-random sampling include convenience samples and…

Artificial Intelligence · Computer Science 2013-01-18 Gregory F. Cooper

Bayesian estimation is increasingly popular for performing model based inference to support policymaking. These data are often collected from surveys under informative sampling designs where subject inclusion probabilities are designed to…

Methodology · Statistics 2018-07-13 Luis G. Leon-Novelo, Terrance D. Savitsky

Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of…

Computers and Society · Computer Science 2019-05-16 Zijian Wang, Scott A. Hale, David Adelani, Przemyslaw A. Grabowicz, Timo Hartmann, Fabian Flöck, David Jurgens

Using social media data for statistical analysis of general population faces commonly two basic obstacles: firstly, social media data are collected for different objects than the population units of interest; secondly, the relevant measures…

Applications · Statistics 2019-05-03 Martina Patone, Li-Chun Zhang

In population studies, it is standard to sample data via designs in which the population is divided into strata, with the different strata assigned different probabilities of inclusion. Although there have been some proposals for including…

Methodology · Statistics 2014-09-29 T. Kunihama, A. H. Herring, C. T. Halpern, D. B. Dunson

We propose a general method for distributed Bayesian model choice, using the marginal likelihood, where a data set is split in non-overlapping subsets. These subsets are only accessed locally by individual workers and no data is shared…

Computation · Statistics 2022-10-18 Alexander Buchholz, Daniel Ahfock, Sylvia Richardson

In this paper, we study the accuracy of values aggregated over classes predicted by a classification algorithm. The problem is that the resulting aggregates (e.g., sums of a variable) are known to be biased. The bias can be large even for…

Machine Learning · Statistics 2019-12-02 Q. A. Meertens, C. G. H. Diks, H. J. van den Herik, F W Takes

Non-probability sampling, for example in the form of online panels, has become a fast and cheap method to collect data. While reliable inference tools are available for classical probability samples, non-probability samples can yield…

Methodology · Statistics 2022-04-05 Gerhard Tutz

We propose the use of a simple intuitive principle for measuring algorithmic classification bias: the significance of the differences in a classifier's error rates across the various demographics is inversely commensurate with the sample…

Methodology · Statistics 2026-01-08 Ioannis Ivrissimtzis, Shauna Concannon, Matthew Houliston, Graham Roberts

We have developed a new Bayesian method to correct the flux densities of astronomical sources. The hybrid method combines a simulated likelihood to model survey selection together with an analytic source-count-based prior. The simulated…

Astrophysics of Galaxies · Physics 2020-04-29 Megan B. Gralla, Tobias A. Marriage