Data Analysis, Statistics and Probability
A geometric form of information theory allows for reasonable, i.e. probabilistic, evidence-ranking based, and generalized noise-level dependent, classifications of the crystallographic and quasicrystallographic symmetries in noisy digital…
We introduce the interactive tool pandemonium to cluster model predictions that depend on a set of parameters. The model predictions are used to define the coordinates in observable space which go into the clustering. The results of this…
Within the framework of likelihood-based statistical tests for high energy physics measurements, we derive generalized expressions for estimating the statistical significance of discovery using the asymptotic approximations of Wilks and…
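For orientation, the asymptotic approximation can be illustrated with the textbook case of a single-bin counting experiment with known background. The sketch below is not the generalized expression derived in the abstract above; the function name `asimov_significance` and the example yields are illustrative assumptions.

```python
import math


def asimov_significance(s: float, b: float) -> float:
    """Median discovery significance Z = sqrt(q0) for expected signal s
    on top of a known expected background b (single bin, no systematics).

    Uses the standard asymptotic result
        q0 = 2 * ((s + b) * ln(1 + s / b) - s).
    """
    if s < 0 or b <= 0:
        raise ValueError("need s >= 0 and b > 0")
    if s == 0:
        return 0.0
    q0 = 2.0 * ((s + b) * math.log1p(s / b) - s)
    return math.sqrt(q0)


# Hypothetical yields chosen only for illustration.
s_example, b_example = 10.0, 100.0
z_asymptotic = asimov_significance(s_example, b_example)
z_naive = s_example / math.sqrt(b_example)  # the s/sqrt(b) rule of thumb
print(f"asymptotic Z = {z_asymptotic:.3f}, s/sqrt(b) = {z_naive:.3f}")
```

For s much smaller than b the two numbers nearly agree; the asymptotic value is always the smaller of the two.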
Single particle imaging (SPI) at X-ray free electron lasers (XFELs) is particularly well suited to determine the 3D structure of particles in their native environment. For a successful reconstruction, diffraction patterns originating from a…
An accurate forecast of the red tide respiratory irritation level would improve the lives of many people living in areas affected by algal blooms. Using a decades-long database of daily beach conditions, two conceptually different models to…
Link prediction is a paradigmatic problem in network science, which aims at estimating the existence likelihoods of nonobserved links, based on known topology. After a brief introduction of the standard problem and metrics of link…
Basin stability (BS) is a measure of nonlinear stability in multi-stable dynamical systems. BS has previously been estimated using Monte Carlo simulations, which require explicit knowledge of a dynamical model. We discuss the…
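The model-based Monte Carlo estimate mentioned above can be sketched as follows: draw random initial conditions, integrate each one, and report the fraction that ends on the attractor of interest. The toy system dx/dt = x - x^3, the sampling interval, and all parameter values are assumptions made for illustration, not taken from the abstract.

```python
import numpy as np


def basin_stability(n_samples=2000, low=-1.0, high=3.0,
                    dt=0.01, n_steps=2000, seed=0):
    """Monte Carlo estimate of the basin stability of the fixed point x = +1
    for the bistable toy system dx/dt = x - x**3 (attractors at x = -1, +1).

    Initial conditions are drawn uniformly from [low, high].
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(low, high, size=n_samples)
    # Explicit Euler integration of all trajectories at once.
    for _ in range(n_steps):
        x = x + dt * (x - x**3)
    # A trajectory counts as converged if it ends close to x = +1.
    converged = np.abs(x - 1.0) < 1e-2
    return converged.mean()


bs_estimate = basin_stability()
print(f"estimated basin stability of x = +1: {bs_estimate:.3f}")
```

In this toy setting the basin of x = +1 is x > 0, so the estimate should lie near 3/4 for the interval [-1, 3], with a statistical error that shrinks as 1/sqrt(n_samples).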
Time irreversibility is a common signature of nonlinear processes, and a fundamental property of non-equilibrium systems driven by non-conservative forces. A time series is said to be reversible if its statistical properties are invariant…
This work deals with the inference of catalytic recombination parameters from plasma wind tunnel experiments for reusable thermal protection materials. One of the critical factors affecting the performance of such materials is the…
We introduce an index based on information theory to quantify the stationarity of a stochastic process. The index compares on the one hand the information contained in the increment at the time scale $\tau$ of the process at time $t$ with,…
Astronomy, biophysics, and materials science often depend on the ability to extract information from faint spatial signals. Here we present a morphometric analysis technique to quantify the shape of structural deviations in greyscale…
Autoencoders have useful applications in high energy physics in anomaly detection, particularly for jets - collimated showers of particles produced in collisions such as those at the CERN Large Hadron Collider. We explore the use of…
The particle-flow (PF) algorithm is used in general-purpose particle detectors to reconstruct a comprehensive particle-level view of the collision by combining information from different subdetectors. A graph neural network (GNN) model,…
When searching for radiological sources in an urban area, a vehicle-borne detector system will often measure complex, varying backgrounds primarily from natural gamma-ray sources. Much work has been focused on developing spectral algorithms…
In the data analysis of oscillatory systems, methods based on phase reconstruction are widely used to characterize phase-locking properties and to infer the phase dynamics. The main component in these studies is an extraction of the phase…
We study the avalanche statistics observed in a minimal random growth model. The growth is governed by a reproduction rate obeying a probability distribution with finite mean $a$ and variance $v_a$. These two control parameters determine if the…
When dealing with non-stationary systems, for which many time series are available, it is common to divide time into epochs, i.e. smaller time intervals, and to deal with short time series in the hope of having some form of approximate stationarity…
Kernel methods are widely used for probability estimation by measuring the distribution of low-passed vector distances in reconstructed state spaces. However, the information conveyed by the vector distances that are greater than the…
Time irreversibility, defined as the lack of invariance of the statistical properties of a system or time series under the operation of time reversal, has received increasing attention in recent decades, thanks to the information…
In this study, we present a method for classifying dynamical systems using a hybrid approach involving recurrence plots and a convolutional neural network (CNN). This is performed by obtaining the recurrence matrix of a time series generated…
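As a reminder of what a recurrence matrix is, the sketch below builds one for a scalar time series: entry (i, j) is 1 when the states at times i and j lie within a threshold of each other. The lack of time-delay embedding, the threshold value, and the sine-wave input are simplifying assumptions and need not match the construction used in the study.

```python
import numpy as np


def recurrence_matrix(x, eps):
    """Binary recurrence matrix R[i, j] = 1 if |x[i] - x[j]| <= eps, else 0,
    for a one-dimensional time series x (no time-delay embedding)."""
    x = np.asarray(x, dtype=float)
    # Pairwise distances via broadcasting: shape (N, N).
    dist = np.abs(x[:, None] - x[None, :])
    return (dist <= eps).astype(np.uint8)


# Illustrative input: two periods of a sine wave.
t = np.linspace(0.0, 4.0 * np.pi, 200)
R = recurrence_matrix(np.sin(t), eps=0.1)
print(R.shape, int(R.sum()))
```

The resulting array is symmetric with ones on the diagonal and can be passed to an image classifier as a single-channel image.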