Data Analysis, Statistics and Probability
The decision to incorporate cross-validation into the validation process for mathematical models raises an immediate question: how should one partition the data into calibration and validation sets? We answer this question systematically: we…
Evaluating the neutronic state of the whole nuclear core is an important topic that has strong implications for nuclear core management and for security monitoring. The core state is evaluated using measurements. Usually, part of the…
Partial wave analysis is a key technique in hadron spectroscopy. The use of unbinned likelihood fits on large statistics data samples and ever more complex physics models makes this analysis technique computationally very expensive.…
We consider the problem of an ensemble Kalman filter when only partial observations are available. In particular we consider the situation where the observational space consists of variables which are directly observable with known…
In this paper we propose numerical measures for evaluating the aesthetic interest of simple patterns. The patterns consist of elements (symbols, pixels, etc.) in regular square arrays. The measures depend on two characteristics of the…
Irreversible aggregation is revisited in view of recent work on renormalization of complex networks. Its scaling laws and phase transitions are related to percolation transitions seen in the latter. We illustrate our points by giving the…
The transition between ballistic and diffusive motion poses difficult problems in several fields of physics. In this work we show how to calculate the spectra of the correlation functions between fields of arbitrary spatial dependence as…
A new optimization procedure for the estimation of Kramers-Moyal coefficients from stationary, one-dimensional, Markovian time series data is presented. The method takes advantage of a recently reported approach that allows one to calculate…
We present an advanced interpolation method for estimating smooth spatiotemporal profiles for local highway traffic variables such as flow, speed and density. The method is based on stationary detector data as typically collected by traffic…
The statistical methods used by the ATLAS Collaboration for setting upper limits or establishing a discovery are reviewed, as they are fundamental ingredients in the search for new phenomena. The analyses published so far adopted different…
Imbalanced data sets containing many more background than signal instances are very common in particle physics, and will also be characteristic of the upcoming analyses of LHC data. Following up the work presented at ACAT 2008, we use the…
The MIT-BIH Normal Sinus Rhythm Database is studied using a statistical analysis technique based on the wavelet and Hilbert transforms. With that technique, it was previously found that there is a collective and intrinsic…
Percentile ranks and the I3 indicator were introduced by Bornmann, Leydesdorff, Mutz and Opthof. These two notions are based on the concept of percentiles (or quantiles) for discrete data. As several definitions for these notions exist, we…
A range of early studies have been conducted to illustrate human mobility patterns using different tracking data, such as dollar notes, cell phones and taxicabs. Here, we explore human mobility patterns based on massive tracking data of US…
Combining measurements which have "theoretical uncertainties" is a delicate matter, due to an unclear statistical basis. We present an algorithm based on the notion that a theoretical uncertainty represents an estimate of bias.
The interpretation of data in terms of multi-parameter models of new physics, using the Bayesian approach, requires the construction of multi-parameter priors. We propose a construction that uses elements of Bayesian reference analysis. Our…
We present an automatic, fast, accurate and robust method of classifying astronomical objects. The Self-Organizing Map (SOM), an unsupervised Artificial Neural Network (ANN) algorithm, is used for classification of stellar spectra of…
An n-step Pearson-Gamma random walk in R^d starts at the origin and consists of n independent steps with gamma-distributed lengths and uniform orientations. The gamma distribution of each step length has a shape parameter q > 0. Constrained…
A new set of parameters to describe the word frequency behavior of texts is proposed. An analogy between the word frequency distribution and the Bose distribution is suggested, and the notion of "temperature" is introduced for this case.…
We review statistical methods used in the search for new physics at the LHC.