Related papers: Testing Multispecies Coalescent Simulators using S…
Because biological processes can make different loci have different evolutionary histories, species tree estimation requires multiple loci from across the genome. While many processes can result in discord between gene trees and species…
Phylogenetic species trees typically represent the speciation history as a bifurcating tree. Speciation events that simultaneously create more than two descendants, thereby creating polytomies in the phylogeny, are possible. Moreover, the…
A method was developed for Bayesian inference of species phylogeny using the multi-species coalescent model. To improve the mixing properties of the Markov chain Monte Carlo (MCMC) algorithm that traverses the space of species trees, we…
Using topological summaries of gene trees as a basis for species tree inference is a promising approach to obtain acceptable speed on genomic-scale datasets, and to avoid some undesirable modeling assumptions. Here we study the…
It is often difficult to quantitatively determine if a new molecular simulation algorithm or software properly implements sampling of the desired thermodynamic ensemble. We present some simple statistical analysis procedures to allow…
The classic multispecies coalescent (MSC) model provides the means for theoretical justification of incomplete lineage sorting-aware species tree inference methods. A large body of work in phylogenetics is dedicated to the design of…
The multi-species coalescent provides an elegant theoretical framework for estimating species trees and species demographics from genetic markers. Practical applications of the multi-species coalescent model are, however, limited by the…
Computer simulations have become an important tool across the biomedical sciences and beyond. For many important problems several different models or hypotheses exist and choosing which one best describes reality or observed data is not…
This article describes SimEngine, an open-source R package for structuring, maintaining, running, and debugging statistical simulations on both local and cluster-based computing environments. Several R packages exist for structuring…
Quantum simulators, in which well controlled quantum systems are used to reproduce the dynamics of less understood ones, have the potential to explore physics that is inaccessible to modeling with classical computers. However, checking the…
A boson sampling device could efficiently sample from the output probability distribution of noninteracting bosons undergoing many-body interference. This problem is not only classically intractable, but its solution is also believed to be…
Large-scale statistical analysis of data sets associated with genome sequences plays an important role in modern biology. A key component of such statistical analyses is the computation of $p$-values and confidence bounds for statistics…
Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common…
In this paper, we investigate the problem of assessing statistical methods and effectively summarizing results from simulations. Specifically, we consider problems of the type where multiple methods are compared on a reasonably large test…
We present a new data-driven benchmark system to evaluate the performance of new MCMC samplers. Taking inspiration from the COCO benchmark in optimization, we view this task as having critical importance to machine learning and statistics…
Assessing the validity of user simulators when used for the evaluation of information retrieval systems remains an open question, constraining their effective use and the reliability of simulation-based results. To address this issue, we…
This work develops formal statistical inference procedures for machine learning ensemble methods. Ensemble methods based on bootstrapping, such as bagging and random forests, have improved the predictive accuracy of individual trees, but…
This survey aims at providing a comprehensive overview of the recent trends in the field of modeling and simulation (M&S) of interactions between users and recommender systems and applications of the M&S to the performance improvement of…
The R package MfUSampler provides Monte Carlo Markov Chain machinery for generating samples from multivariate probability distributions using univariate sampling algorithms such as Slice Sampler and Adaptive Rejection Sampler. The sampler…
In urgent decision making applications, ensemble simulations are an important way to determine different outcome scenarios based on currently available data. In this paper, we will analyze the output of ensemble simulations by considering…