Related papers: On Data Analysis Pipelines and Modular Bayesian Mo…
Standard Bayesian inference can build models that combine information from various sources, but this inference may not be reliable if components of a model are misspecified. Cut inference, as a particular type of modularized Bayesian…
Hierarchical models are versatile tools for joint modeling of data sets arising from different, but related, sources. Fully Bayesian inference may, however, become computationally prohibitive if the source-specific data models are complex,…
There has been much recent interest in modifying Bayesian inference for misspecified models so that it is useful for specific purposes. One popular modified Bayesian inference method is "cutting feedback" which can be used when the model…
This work considers Bayesian inference under misspecification for complex statistical models comprised of simpler submodels, referred to as modules, that are coupled together. Such ``multi-modular" models often arise when combining…
The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and fit…
Modular Bayesian methods perform inference in models that are specified through a collection of coupled sub-models, known as modules. These modules often arise from modelling different data sources or from combining domain knowledge from…
Estimating the parameters of mathematical models is a common problem in almost all branches of science. However, this problem can prove notably difficult when processes and model descriptions become increasingly complex and an explicit…
Bayesian modelling enables us to accommodate complex forms of data and make a comprehensive inference, but the effect of partial misspecification of the model is a concern. One approach in this setting is to modularize the model, and…
Statistical models typically capture uncertainties in our knowledge of the corresponding real-world processes, however, it is less common for this uncertainty specification to capture uncertainty surrounding the values of the inputs to the…
There are many issues that can cause problems when attempting to infer model parameters from data. Data and models are both imperfect, and as such there are multiple scenarios in which standard methods of inference will lead to misleading…
A data analysis pipeline is a structured sequence of steps that transforms raw data into meaningful insights by integrating multiple analysis algorithms. In many practical applications, analytical findings are obtained only after data pass…
Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of…
When partitioning workflows in realistic scenarios, the knowledge of the processing units is often vague or unknown. A naive approach to addressing this issue is to perform many controlled experiments for different workloads, each…
The effectiveness of the machine learning methods for real-world tasks depends on the proper structure of the modeling pipeline. The proposed approach is aimed to automate the design of composite machine learning pipelines, which is…
Modern Bayesian inference involves a mixture of computational techniques for estimating, validating, and drawing conclusions from probabilistic models as part of principled workflows for data analysis. Typical problems in Bayesian workflows…
A data analysis pipeline is a structured sequence of steps that transforms raw data into meaningful insights by integrating various analysis algorithms. In this paper, we propose a novel statistical test to assess the significance of data…
Deep learning-based segmentation and classification are crucial to large-scale biomedical imaging, particularly for 3D data, where manual analysis is impractical. Although many methods exist, selecting suitable models and tuning parameters…
Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of…
Recursive Bayesian inference, in which posterior beliefs are updated in light of accumulating data, is a tool for implementing Bayesian models in applications with streaming and/or very large data sets. As the posterior of one iteration…
Modern data science relies on data analytic pipelines to organize interdependent computational steps. Such analytic pipelines often involve different algorithms across multiple steps, each with its own hyperparameters. To achieve the best…