Data Analysis, Statistics and Probability
High energy physics experiments are currently recording large amounts of data and in a few years will be recording prodigious quantities of data. New methods must be developed to handle this data and make analysis at universities possible.…
As we move to distribute High Energy Physics computing tasks throughout the global Grid, we are encountering ever more severe difficulties installing and selecting appropriate versions of the supporting products. Problems show up at every…
The ALICE experiment at the LHC will detect and identify prompt photons and light neutral mesons with the PHOS detector and the additional EMCal electromagnetic calorimeter. Charged particles will be detected and identified by the central…
An event reweighting technique incorporated into multivariate training algorithms has been developed and tested using Artificial Neural Networks (ANN) and Boosted Decision Trees (BDT). The event reweighting training is compared to that of…
A detailed description of an original method used to measure the luminosity accumulated by the HERA-B experiment for a data sample taken during the 2002-2003 HERA running period is reported. We show that, with this method, a total…
We derive conditions under which alternating renewal processes can be used to construct correlated Poisson processes. The pairwise correlation function is also derived, showing that the resulting correlations can be negative. The technique…
We study the statistics of the horizontal component of atmospheric boundary layer wind speed. Motivated by its non-stationarity, we investigate which parameters remain constant or can be regarded as being piece-wise constant and explain how…
Hypothesis tests for the presence of new sources of Poisson counts amidst background processes are frequently performed in high energy physics (HEP), gamma ray astronomy (GRA), and other branches of science. While there are conceptual…
The outbreak of the severe acute respiratory syndrome (SARS) epidemic in 2003 had an unprecedented impact on people's daily lives. One of the most significant impacts is the risk of contracting SARS while engaging in daily…
There are things we know, things we know we don't know, and then there are things we don't know we don't know. In this paper we address the latter two issues in a Bayesian framework, introducing the notion of doubt to quantify the degree of…
We report an algorithm for the partition of a line segment according to a given ratio $\nu$. At each step the length distribution among sets of the partition follows a binomial distribution. We call a $k$-set the set of elements with the…
VISPA is a novel development environment for high energy physics analyses, based on a combination of graphical and textual steering. The primary aim of VISPA is to support physicists in prototyping, performing, and verifying a data analysis…
Soft-constraint affinity propagation (SCAP) is a new statistical-physics based clustering technique. First we give the derivation of a simplified version of the algorithm and discuss possibilities of time- and memory-efficient…
The main goal of this paper is the application of Bayesian model comparison, based on posterior probabilities and posterior odds ratios, to testing the explanatory power of a set of competing GARCH (Generalised Autoregressive…
The usual development of the continuous time random walk (CTRW) assumes that jumps and time intervals are a two-dimensional set of independent and identically distributed random variables. In this paper we address the theoretical setting of…
Data clustering, including problems such as finding network communities, can be put into a systematic framework by means of a Bayesian approach. The application of Bayesian approaches to real problems can be, however, quite challenging. In…
I use the example of the Earth's orbit to illustrate the principle behind the Akaike Information Criterion, and refute the misconception that the criterion, by definition, discards more complex models in favour of simpler ones.
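As a minimal sketch of the point made in this abstract, the snippet below compares two polynomial fits by AIC on synthetic data. It does not reproduce the paper's Earth-orbit example: the data, the noise level, and the helper names `aic_least_squares` and `aic_for_degree` are assumptions made purely for illustration. It uses the common least-squares form AIC = n ln(RSS/n) + 2k, valid up to an additive constant shared by all models fitted to the same data.

```python
import numpy as np


def aic_least_squares(y, y_fit, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to an additive
    constant that is the same for every model fitted to the same data."""
    n = len(y)
    rss = float(np.sum((y - y_fit) ** 2))
    return n * np.log(rss / n) + 2 * n_params


def aic_for_degree(x, y, degree):
    """Fit a polynomial of the given degree and return its AIC."""
    coeffs = np.polyfit(x, y, degree)
    y_fit = np.polyval(coeffs, x)
    # A polynomial of degree d has d + 1 free coefficients.
    return aic_least_squares(y, y_fit, degree + 1)


# Synthetic data (an assumption for illustration): a quadratic signal
# with small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(0.0, 0.1, x.size)

aic_by_degree = {d: aic_for_degree(x, y, d) for d in (1, 2)}
best_degree = min(aic_by_degree, key=aic_by_degree.get)
print(aic_by_degree, "preferred degree:", best_degree)
```

The extra parameter of the quadratic model costs only 2 units of AIC, while the improvement in fit lowers the n ln(RSS/n) term by far more, so the more complex model is preferred here. The penalty term only tips the balance towards the simpler model when the gain in fit is small.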
We construct an adaptive nonparametric procedure for recovering a multidimensional time-defined signal (adaptive filtering, noise cancellation) that is asymptotically optimal in the classical norm of the space L(2) of square-integrable functions, on the…
We present a procedure for reconstructing particle cascades from event data measured in a high energy physics experiment. For evaluating the hypothesis of a specific physics process causing the observed data, all possible reconstruction…
Motivated by the recent demonstration of its use as a tool for the detection and characterization of phase-shape correlations in multivariate time series, we show that eigenvalue decomposition can also be applied to a matrix of indices of…