Related papers: Preferential Sampling for Bivariate Spatial Data
An evolving problem in the field of spatial and ecological statistics is that of preferential sampling, where biases may be present due to a relationship between sample data locations and a response of interest. This field of research bears…
This paper analyses the effect of preferential sampling in geostatistics when the choice of new sampling locations is the main interest of the researcher. A Bayesian criterion based on maximizing utility functions is used. Simulated studies…
Presence/absence data and presence-only data are the two customary sources for learning about species distributions over a region. We illuminate the fundamental modeling differences between the two types of data. Most simply, locations are…
This paper explores the topic of preferential sampling, specifically situations where monitoring sites in environmental networks are preferentially located by the designers. This means the data arising from such networks may not accurately…
Preferential sampling is a common feature in geostatistics and occurs when the locations to be sampled are chosen based on information about the phenomenon under study. In this case, point pattern models are commonly used as the probability…
Marine Protected Areas (MPAs) have been established globally to conserve marine resources. Given their maintenance costs and impact on commercial fishing, it is critical to evaluate their effectiveness to support future conservation. In…
Preferential sampling has attracted considerable attention in geostatistics since the pioneering work of Diggle et al. (2010). A variety of likelihood-based approaches have been developed to correct estimation bias by explicitly modelling…
Continuous-space species distribution models (SDMs) have long been a valuable tool in ecological statistical analysis. Geostatistical and preferential models are both common in ecology. Geostatistical models are…
In the study of natural and artificial complex systems, responses that are not completely determined by the considered decision variables are commonly modelled probabilistically, resulting in response distributions varying across decision…
Spatial small area estimation models have become very popular in some contexts, such as disease mapping. Data in disease mapping studies are exhaustive, that is, the available data are supposed to be a complete register of all the…
Survey sampling plays an important role in the efficient allocation and management of resources. The essence of survey sampling lies in acquiring a sample of data points from a population and subsequently using this sample to estimate the…
Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task formulates an observed sequence data likelihood exploiting…
Accurately measuring discrimination is crucial to faithfully assessing fairness of trained machine learning (ML) models. Any bias in measuring discrimination leads to either amplification or underestimation of the existing disparity.…
This paper presents a general model framework for detecting the preferential sampling of environmental monitors recording an environmental process across space and/or time. This is achieved by considering the joint distribution of an…
Joint modeling of spatially-oriented dependent variables is commonplace in the environmental sciences, where scientists seek to estimate the relationships among a set of environmental outcomes accounting for dependence among these outcomes…
Model selection requires repeatedly evaluating models on a given dataset and measuring their relative performances. In modern applications of machine learning, the models being considered are increasingly expensive to evaluate and the…
Spatial confounding is a fundamental issue in spatial regression models which arises because spatial random effects, included to approximate unmeasured spatial variation, are typically not independent of covariates in the model. This can…
In the last 25 years there has been a substantial increase in the amount of data collected from animal-mounted sensors (bio-probes), which are often used to study the animals' behaviour or environment. We focus here on an example of the…
Biased sampling designs can be highly efficient when studying rare (binary) or low variability (continuous) endpoints. We consider longitudinal data settings in which the probability of being sampled depends on a repeatedly measured…
Standard supervised machine learning assumes that the distribution of the source samples used to train an algorithm is the same as that of the target samples on which it is supposed to make predictions. However, as any data scientist…