Data Analysis, Statistics and Probability
Phase space reconstruction (PSR) methods allow for the analysis of low-dimensional data with methods from dynamical systems theory, but their application to prediction models, like those from machine learning (ML), is limited. Therefore, we…
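For orientation, one common PSR technique is time-delay embedding, which turns a scalar series into vectors of lagged values. The sketch below is a minimal illustration under that assumption; the truncated abstract does not say which PSR method the authors use, and the function name `delay_embed` and its parameters `dim` and `tau` are hypothetical.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding of a scalar series (illustrative sketch only).

    Returns vectors [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]].
    `dim` (embedding dimension) and `tau` (lag in samples) are assumed inputs.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    # Each row collects `dim` samples spaced `tau` steps apart.
    return [
        [series[t + k * tau] for k in range(dim)]
        for t in range(n_vectors)
    ]


if __name__ == "__main__":
    for row in delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2):
        print(row)
```

In practice `dim` and `tau` are usually chosen from the data (for example with false-nearest-neighbour or mutual-information heuristics), which this sketch does not attempt.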
In this article, we offer a comprehensive analysis of Urysohn's classifier in a binary classification context. It uses Urysohn's lemma from topology to construct separating functions, providing rigorous and adaptable solutions.…
We introduce a new technique called Drapes to enhance the sensitivity in searches for new physics at the LHC. By training diffusion models on side-band data, we show how background templates for the signal region can be generated either…
The "100-year flood" is commonly used, for instance in newspapers, but flood hazard assessment is more complex than it seems. We first describe an animation entitled "bag of floods" to make flood quantiles more concrete, using marbles whose…
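A small worked calculation shows why the "100-year flood" is easy to misread. It assumes independent years and a constant annual exceedance probability of 1/T. These are textbook simplifications, not claims taken from the truncated abstract, and the function name is hypothetical.

```python
def prob_at_least_one_exceedance(return_period_years, horizon_years):
    """Probability of at least one T-year flood within a horizon of n years.

    Assumes independent years with a constant annual exceedance
    probability of 1 / T (a simplifying assumption).
    """
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** horizon_years


if __name__ == "__main__":
    for horizon in (1, 30, 100):
        p = prob_at_least_one_exceedance(100, horizon)
        print(f"{horizon:>3} years: {p:.3f}")
```

Under these assumptions, a 100-year flood has roughly a 63% chance of occurring at least once in any given 100-year span, and roughly a 26% chance within 30 years.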
To properly estimate signal significance while accounting for both statistical and systematic uncertainties, we conducted a study to analyze the impact of typical systematic uncertainties, such as background shape, signal shape, and the…
Variational data assimilation in ocean models depends on the ability to model general correlation operators in the presence of coastlines. Grid-point filters based on diffusion operators are widely used for this purpose, but come with a…
We discuss the use of Gaussian random fields to estimate the look-elsewhere effect correction. We show that Gaussian random fields can be used to model the null-hypothesis significance maps from a large set of statistical problems commonly…
Recent work has demonstrated that graph neural networks (GNNs) can match the performance of traditional algorithms for charged particle tracking while improving scalability to meet the computing challenges posed by the HL-LHC. Most GNN…
Across many disciplines from neuroscience and genomics to machine learning, atmospheric science and finance, the problems of denoising large data matrices to recover signals obscured by noise, and of estimating the structure of these…
The incoherent scatter radar (ISR) technique is a powerful remote sensing tool for ionosphere and thermosphere dynamics in the near-Earth space environment. Weak ISR scatter from naturally occurring Langmuir oscillations, or plasma lines,…
We present a novel data format design that obviates the need for data tiers by storing individual event data products in column objects. The objects are stored and retrieved through Ceph S3 technology, with a layout designed to minimize…
The characterization of dynamical processes in living systems provides important clues for their mechanistic interpretation and link to biological functions. Thanks to recent advances in microscopy techniques, it is now possible to…
Based on the work of Klein and Roodman, we present an alternate conclusion as to the charm of blind analysis in physics experiments.
A new method for approximating the visibility graph (VG) of a time series is introduced here. It exploits the fact that most nodes in the resulting network are not connected to those far away from them. This…
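To make the idea concrete, the sketch below builds the natural visibility graph of an evenly sampled series and optionally ignores pairs further apart than a window. The window parameter `max_distance` is a hypothetical stand-in for the locality idea mentioned above; it is not the authors' approximation, whose details are cut off in this listing.

```python
def visibility_edges(series, max_distance=None):
    """Edges (a, b), a < b, of the natural visibility graph of `series`.

    Samples are assumed evenly spaced in time. Two samples see each other
    if every sample between them lies strictly below the straight line
    joining them. `max_distance` (hypothetical) limits how far ahead each
    node looks; None gives the exact graph.
    """
    n = len(series)
    edges = []
    for a in range(n - 1):
        stop = n if max_distance is None else min(n, a + max_distance + 1)
        max_slope = float("-inf")  # steepest slope from a to any earlier b
        for b in range(a + 1, stop):
            slope = (series[b] - series[a]) / (b - a)
            # b is visible from a only if it rises above every previous line of sight.
            if slope > max_slope:
                edges.append((a, b))
                max_slope = slope
    return edges


if __name__ == "__main__":
    print(visibility_edges([1, 3, 2, 4]))
    print(visibility_edges([1, 3, 2, 4], max_distance=1))
```

With a window, the cost drops from quadratic in the series length to linear times the window size, at the price of missing any long-range edges.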
This article expands the framework of Bayesian inference and provides direct probabilistic methods for approaching inference tasks that are typically handled with information theory. We treat Bayesian probability updating as a random…
Identifying pure components in mixtures is a common yet challenging problem. The associated unmixing process requires the pure components, also known as endmembers, to be sufficiently spectrally distinct. Even with this requirement met,…
Molecular dynamics (MD) simulations are applied to investigate the dependence of the kinetic friction coefficient on temperature at the nanoscale. The system comprises an aluminum spherical particle consisting of 32,000 atoms in an…
We performed molecular dynamics (MD) experiments to explore dry sliding friction at the nanoscale. We used a setup comprising a spherical particle built up of 32,000 aluminium atoms, resting on a semi-space with a free surface, modelled…
We introduce a method for reconstructing macroscopic models of one-dimensional stochastic processes with long-range correlations from sparsely sampled time series by combining fractional calculus and discrete-time Langevin equations. The…
Cumulant mapping employs a statistical reconstruction of the whole by sampling its parts. The theory developed in this work formalises and extends ad hoc methods of 'multi-fold' or 'multi-dimensional' covariance mapping. Explicit formulae…