Data Analysis, Statistics and Probability
We are in an era of precision measurements at the Large Hadron Collider. The precision achievable on some of these measurements is, however, limited by large systematic uncertainties. This paper introduces a new technique to reduce the total…
Slender marine structures such as deep-water marine risers are subjected to currents and will normally experience Vortex Induced Vibrations (VIV), which can cause fast accumulation of fatigue damage. The ocean current is often…
In this paper, we consider the problem of designing a training set using the most informative molecules from a specified library to build data-driven molecular property models. Specifically, we use (i) sparse generalized group additivity…
Metrics of model goodness-of-fit, model comparison, and model parameter estimation are the main categories of statistical problems in science. Bayesian and frequentist methods that address these questions often rely on a likelihood…
Traditionally, events collected at relativistic heavy-ion colliders are classified according to some centrality estimator (e.g. the number of produced charged particles) related to the initial energy density and volume of the system. In a…
The advent of fabrication techniques such as additive manufacturing has focused attention on the considerable variability of material response due to defects and other microstructural aspects. This variability motivates the development of…
The second derivative image (SDI) method is widely applied to sharpen dispersive data features in multi-dimensional spectroscopies such as angle-resolved photoemission spectroscopy (ARPES). Here, the SDI function is represented in Fourier…
Explosions near ground generate multiple geophysical waveforms in the radiation-dominated range of their signature fields. Multi-phenomenological explosion monitoring (MultiPEM) at these ranges requires the predictive capability to forecast…
Intense short-wavelength pulses from free-electron lasers and high-harmonic-generation sources enable diffractive imaging of individual nano-sized objects with a single x-ray laser shot. The enormous data sets with up to several million…
In this paper we introduce the horizon visibility graph, a simple extension to the popular horizontal visibility graph representation of a time series, and show that it possesses a rigorous mathematical foundation in computational algebraic…
Standard statistical analysis is unable to provide reliable confidence intervals on expectation values of probability distributions that do not satisfy the conditions of the central limit theorem. We present a regression-based estimator of…
Even when a system is described by a well-known set of equations that lead to deterministic behavior, the value of a measurand obtained in a real experiment will usually scatter. Accordingly, an uncertainty is associated with…
Training features used to analyse physical processes are often highly correlated, and determining which ones are most important for the classification is a non-trivial task. For the use case of a search for a top-quark pair produced in…
The organization of the distributed user analysis on the Worldwide LHC Computing Grid (WLCG) infrastructure is one of the most challenging tasks among the computing activities at the Large Hadron Collider. The Experiment Dashboard offers a…
The Bayesian Block algorithm, originally developed for applications in astronomy, can be used to improve the binning of histograms in high energy physics. The visual improvement can be dramatic, as shown here with two simple examples. More…
The interaction between a turbulent convective boundary layer (CBL) and the underlying land surface is an important research problem in the geosciences. In order to model this interaction adequately, it is necessary to develop tools which…
Least-squares fits are an important tool in many data analysis applications. In this paper, we review theoretical results that are relevant to their application to data from counting experiments. Using a simple example, we illustrate the…
The second edition of ISO 19229 expands the guidance in its predecessor in two ways. Firstly, it provides more support and examples describing possible experimental approaches for purity analysis. A novelty is that it describes how the beta…
Engineering simulators used for steady-state multiphase pipe flows are commonly utilized to predict pressure drop. Such simulators are typically based on either empirical correlations or first-principles mechanistic models. The simulators…
In this paper, we consider a surrogate modeling approach using a data-driven nonparametric likelihood function constructed on a manifold on which the data lie (or to which they are close). The proposed method represents the likelihood…