Data Analysis, Statistics and Probability
The ALPHA-g experiment at CERN aims to perform the first-ever direct measurement of the effect of gravity on antimatter, determining its weight to within 1% precision. This measurement requires an accurate prediction of the vertical…
This study evaluates how well analog-based methods statistically predict the longitudinal velocity in a turbulent flow. The data come from hot-wire measurements in the Modane wind tunnel. We…
Correcting for detector effects in experimental data, particularly through unfolding, is critical for enabling precision measurements in high-energy physics. However, traditional unfolding methods face challenges in scalability,…
We introduce linear regression using physics-based basis functions optimized through the geometry of an inner product space. This method addresses the challenge of surrogate modeling with high-dimensional input, as the physics-based basis…
Complex systems are typically characterized by intricate internal dynamics that are often hard to elucidate. Ideally, this requires methods that can detect and classify, in an unsupervised way, the microscopic dynamical events occurring in…
Detector simulation and reconstruction are a significant computational bottleneck in particle physics. We develop Particle-flow Neural Assisted Simulations (Parnassus) to address this challenge. Our deep learning model takes as input a…
This paper describes the COMBINE software package used for statistical analyses by the CMS Collaboration. The package, originally designed to perform searches for a Higgs boson and the combined analysis of those searches, has evolved to…
The sample covariance matrix of a random vector is a good estimate of the true covariance matrix if the sample size is much larger than the length of the vector. In high-dimensional problems, this condition is never met. As a result, in…
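The sample-size condition in the abstract above can be illustrated with a minimal sketch. It assumes independent standard-normal data and uses NumPy; the function name `sample_covariance_rank` is hypothetical and not taken from the paper. With fewer samples than dimensions, the sample covariance matrix cannot have full rank, so it is singular and cannot be inverted.

```python
import numpy as np


def sample_covariance_rank(n_samples, dim, seed=0):
    """Return the numerical rank of the sample covariance of Gaussian data.

    Illustrative helper (hypothetical name): draws n_samples vectors of
    length dim from a standard normal distribution and estimates their
    covariance matrix.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, dim))  # rows are observations
    cov = np.cov(x, rowvar=False)              # (dim, dim) sample covariance
    return int(np.linalg.matrix_rank(cov))


# Few samples relative to the dimension: rank is at most n_samples - 1.
low = sample_covariance_rank(n_samples=20, dim=50)

# Many samples relative to the dimension: the estimate has full rank.
high = sample_covariance_rank(n_samples=5000, dim=50)

print(low, high)
```

The rank bound of `n_samples - 1` follows from subtracting the sample mean, which removes one degree of freedom.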
Dynamical system state estimation and parameter calibration problems are ubiquitous across science and engineering. Bayesian approaches to the problem are the gold standard as they allow for the quantification of uncertainties and enable…
We introduce an approach for analyzing the responses of dynamical systems to external perturbations that combines score-based generative modeling with the Generalized Fluctuation-Dissipation Theorem (GFDT). The methodology enables accurate…
Statistical hypothesis testing is the central method to demarcate scientific theories in both exploratory and inferential analyses. However, whether this method befits such a purpose remains a matter of debate. Established approaches to…
The Jaccard similarity index has often been employed in science and technology as a means to quantify the similarity between two sets. When modified to operate on real-valued data, the Jaccard similarity index can be applied to compare…
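One common real-valued generalization of the Jaccard index, for non-negative vectors, divides the sum of element-wise minima by the sum of element-wise maxima. The sketch below assumes that form; the abstract is truncated, so the authors' definition may differ. The function name `real_valued_jaccard` and the convention of returning 1.0 for two all-zero vectors are assumptions made here for illustration.

```python
def real_valued_jaccard(a, b):
    """Jaccard-type similarity for two non-negative sequences of equal length.

    Computes sum(min(a_i, b_i)) / sum(max(a_i, b_i)). This is one common
    generalization, assumed here; it reduces to the set-based Jaccard index
    when the inputs are 0/1 indicator vectors.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if any(x < 0 for x in a) or any(y < 0 for y in b):
        raise ValueError("inputs must be non-negative")
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    if denominator == 0:
        return 1.0  # assumed convention: two all-zero vectors are identical
    return numerator / denominator


# Indicator vectors for the sets {0, 1} and {1, 2} over a universe of 3 items:
# intersection size 1, union size 3.
print(real_valued_jaccard([1, 1, 0], [0, 1, 1]))

# Real-valued inputs: (1 + 1) / (2 + 2).
print(real_valued_jaccard([1.0, 2.0], [2.0, 1.0]))
```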
In this work we present the wavScalogram R package, which contains methods based on wavelet scalograms for time series analysis. These methods are related to two main wavelet tools: the windowed scalogram difference and the scale index. The…
The cost of writing, transferring, and storing large data from unsteady simulations limits access to the entire solution, often leaving much of the flow under-sampled or unanalyzed. For example, modeling transient behavior of rare dynamic…
The arithmetic mean plays a central role in science and technology, being directly related to the concepts of statistical expectation and centrality. Yet, it is highly susceptible to the presence of outliers or biased interference in the…
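The sensitivity of the arithmetic mean to outliers can be seen in a small worked example. The numbers below are made up for illustration and are not data from the paper. The median is shown only as a familiar point of comparison and is not necessarily the alternative the authors propose.

```python
from statistics import mean, median

# Illustrative sample (made-up values) and the same sample with one
# observation replaced by an extreme outlier.
clean = [1, 2, 3, 4, 5]
contaminated = [1, 2, 3, 4, 1000]

clean_mean = mean(clean)                    # (1+2+3+4+5) / 5
contaminated_mean = mean(contaminated)      # (1+2+3+4+1000) / 5
clean_median = median(clean)                # middle value of the sorted data
contaminated_median = median(contaminated)  # unchanged by the outlier

print(clean_mean, contaminated_mean)
print(clean_median, contaminated_median)
```

A single contaminated value moves the mean from 3 to 202, while the median stays at 3.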
Complex systems can undergo critical transitions, where slowly changing environmental conditions trigger a sudden shift to a new, potentially catastrophic state. Early warning signals for these events are crucial for decision-making in…
Reconstructing the electric field from the voltages measured in an antenna, that is, unfolding the antenna response, poses several problems. Because the signal is noisy, it is often necessary to disregard part of the bandwidth of the…
We investigate the reconstruction of time series from dynamical networks that are partially observed. In particular, we address the extent to which the time series at a node of the network can be successfully reconstructed when measuring…
Physicists routinely need probabilistic models for a number of tasks such as parameter inference or the generation of new realizations of a field. Establishing such models for highly non-Gaussian fields is a challenge, especially when the…
Many real-world systems undergo abrupt changes in dynamics as they move across critical points, often with dramatic consequences. Much existing theory on identifying the time-series signatures of nearby critical points -- such as increased…