Data Analysis, Statistics and Probability
The connection between random matrices and the spectral fluctuations of complex quantum systems in a suitable limit can be explained by using the setup of random matrix theory. Higher-order spacing statistics in the $m$ superposed spectra…
A frequently occurring challenge in experimental and numerical observation is how to resolve features, such as spectral peaks (with center, width, and height) and derivatives, from measured data with unavoidable noise. Therefore, we develop a…
Modern machine learning has enabled parameter inference from event-level data without the need to first summarize all events with a histogram. All of these unbinned inference methods make use of the fact that the events are statistically…
Mutual information (MI) is a fundamental measure of statistical dependence between two variables, yet accurate estimation from finite data remains notoriously difficult. No estimator is universally reliable, and common approaches fail in…
A novel numerical technique is presented to transform one random variable within a system toward statistical quasi-independence from any other random variable in the system. The method's applicability is demonstrated through a particle…
We assume that a sufficiently large database is available, where a physical property of interest and a number of associated ruling primitive variables or observables are stored. We introduce and test two machine learning approaches to…
Since its publication in 1949, Langbein's formula has been applied ubiquitously in both research documents and national guidelines concerning frequency analyses (FAs) of hydrologic extremes. Given a time series of independent…
The detection of a signal variable from multiple variables that contain many noise variables is often approached as a variable selection problem under a given objective variable. This is nothing more than building a supervised model of a…
Frank Porter has recently posted a review of "Confidence intervals for the Poisson distribution" (arXiv:2509.02852). The long, diverse history of such intervals is closely related to that of confidence intervals for the parameter of the…
It is shown how the optimal detector of Gaussian signals can be represented in terms of Bertrand's class of time-frequency distributions. In this representation, the detector is a correlation between the corresponding time-frequency…
Over the last several decades the Shannon-Nyquist criterion has been widely used as a measure of the maximum information content in EXAFS spectra and provided an upper limit on the number of parameters used in fitting data. However, the…
Scientific foundation models hold great promise for advancing nuclear and particle physics by improving analysis precision and accelerating discovery. Yet, progress in this field is often limited by the lack of openly available large scale…
As the particle physics community needs ever higher precision in order to test our current model of the subatomic world, ever larger datasets are necessary. With upgrades scheduled for the detectors of colliding-beam…
The study of the dynamics of natural and artificial systems has provided several examples of deviations from Brownian behavior, generally defined as anomalous diffusion. The investigation of these dynamics can provide a better understanding…
Covariance localization is a critical component of ensemble-based data assimilation (DA) and many current localization schemes simply dampen correlations as a function of distance. Increases in computational resources, broadening scope of…
We propose a method of estimating the uncertainty of a result obtained through extrapolation to the complete basis set limit. The method is based on an ensemble of random walks which simulate all possible extrapolation outcomes that could…
Deviations from Brownian motion leading to anomalous diffusion are found in transport dynamics from quantum physics to life sciences. The characterization of anomalous diffusion from the measurement of an individual trajectory is a…
Detectors in next-generation high-energy physics experiments face several daunting requirements, such as high data rates, damaging radiation exposure, and stringent constraints on power, space, and latency. To address these challenges,…
Jet classification in high-energy particle physics is important for understanding fundamental interactions and probing phenomena beyond the Standard Model. Jets originate from the fragmentation and hadronization of quarks and gluons, and…
The visibility graph (VG) algorithm and its variants have been extensively studied in time series analysis, as they transform a time series into a network of nodes and links, making it possible to characterize the time series in terms of…
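As an illustration of the time-series-to-network mapping mentioned in the last entry, here is a minimal sketch of the standard natural visibility graph construction. It is not taken from the listed paper, whose abstract is truncated; it assumes evenly spaced samples, and the function names `natural_visibility_graph` and `degree_sequence` are hypothetical. Two samples are linked when the straight line between them passes strictly above every sample in between.

```python
def natural_visibility_graph(series):
    """Return the edge set of the natural visibility graph of `series`.

    Assumption: samples are evenly spaced, so the index serves as time.
    Nodes i < j are linked if every intermediate sample k lies strictly
    below the straight line joining (i, series[i]) and (j, series[j]).
    """
    n = len(series)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            yi, yj = series[i], series[j]
            # Height of the line of sight at position k, for each i < k < j.
            visible = all(
                series[k] < yj + (yi - yj) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


def degree_sequence(edges, n):
    """Count the links attached to each of the n nodes."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


# Illustrative input only: a short made-up series.
sample = [3.0, 1.0, 2.0, 4.0, 1.5]
sample_edges = natural_visibility_graph(sample)
sample_degrees = degree_sequence(sample_edges, len(sample))
```

This brute-force version costs O(n^3) in the worst case, which is fine for short series; the resulting degree sequence is one of the simplest network quantities that can then be used to describe the series.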