Data Analysis, Statistics and Probability
The continuous time random walk model plays an important role in modeling so-called anomalous diffusion behaviour. One specific property of such models is the constant time periods visible in the trajectory. In the continuous time random…
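As an illustration of the constant time periods mentioned in the entry above, here is a minimal sketch of a continuous time random walk. The Pareto waiting times, the Gaussian jumps, and the names `simulate_ctrw` and `position_at` are assumptions chosen for illustration; they are not taken from the paper.

```python
import bisect
import random


def simulate_ctrw(n_jumps, alpha=0.7, seed=0):
    """Simulate a simple CTRW: wait a heavy-tailed time, then jump.

    Assumptions (illustrative only): waiting times are Pareto(alpha)
    distributed and jump lengths are standard Gaussian.
    """
    rng = random.Random(seed)
    t, x = 0.0, 0.0
    times, positions = [t], [x]
    for _ in range(n_jumps):
        t += rng.paretovariate(alpha)  # waiting time, always >= 1
        x += rng.gauss(0.0, 1.0)       # jump taken at the end of the wait
        times.append(t)
        positions.append(x)
    return times, positions


def position_at(times, positions, t):
    """Position at time t: the walker rests between consecutive jump times."""
    if t < times[0]:
        raise ValueError("t precedes the start of the trajectory")
    idx = bisect.bisect_right(times, t) - 1  # last jump at or before t
    return positions[idx]


times, positions = simulate_ctrw(n_jumps=100)
```

Between two consecutive entries of `times` the position returned by `position_at` does not change, which is what produces the flat segments in a plotted trajectory.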
Characterizing and controlling nonlinear, multi-scale phenomena play important roles in science and engineering. Cluster-based reduced-order modeling (CROM) was introduced to exploit the underlying low-dimensional dynamics of complex…
A typical experiment in high energy physics is considered. The result of the experiment is assumed to be a histogram consisting of bins or channels with numbers of corresponding registered events. The expected background and expected signal…
Berliner (Likelihood and Bayesian prediction for chaotic systems, J. Am. Stat. Assoc. 1991) identified a number of difficulties in using the likelihood function within the Bayesian paradigm which arise both for state estimation and for…
The heuristic identification of peaks from noisy complex spectra often leads to misunderstanding of the physical and chemical properties of matter. In this paper, we propose a framework based on Bayesian inference, which enables us to…
The neural network-based approach presented in this paper was developed for the analysis of peak profiles and for the prediction of base profile characteristics, such as width, asymmetry, asymptotics ("peak tails"), etc., of the observed…
We present a new method for detecting superdiffusive behaviour and for determining rates of superdiffusion in time series data. Our method applies equally to stochastic and deterministic time series data (with no prior knowledge required of…
Multivariate goodness-of-fit and two-sample tests are important components of many nuclear and particle physics analyses. While a variety of powerful methods are available if the dimensionality of the feature space is small, such tests…
We present a weighted estimator of the covariance and correlation in bipartite complex systems with a double layer of heterogeneity. The advantage provided by the weighted estimators lies in the fact that the unweighted sample covariance…
We review the concept of support vector machines (SVMs) and discuss examples of their use. One of the benefits of SVM algorithms, compared with neural networks and decision trees, is that they can be less susceptible to overfitting than…
Machine learning tools are commonly used in modern high energy physics (HEP) experiments. Different models, such as boosted decision trees (BDT) and artificial neural networks (ANN), are widely used in analyses and even in the software…
Linearity is an important and frequently sought property in electronics and instrumentation. Here, we report a method that, given a transfer function, identifies its most linear region of operation of a fixed width. This…
The concept of sequential visibility graph motifs (subgraphs appearing with characteristic frequencies in the visibility graphs associated with time series) has been advanced recently, along with a theoretical framework to compute analytically…
The CMS experiment at the LHC accelerator at CERN relies on its computing infrastructure to stay at the frontier of High Energy Physics, searching for new phenomena and making discoveries. Even though computing plays a significant role in…
The horizontal visibility algorithm has been recently introduced as a mapping between time series and networks. The challenge lies in characterizing the structure of time series (and the processes that generated those series) using the…
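For readers unfamiliar with the mapping named in the entry above, the following is a minimal sketch of the standard horizontal visibility criterion: two data points are linked when every value strictly between them is lower than both. The function name `horizontal_visibility_edges` is a hypothetical choice for illustration, not an identifier from the paper.

```python
def horizontal_visibility_edges(series):
    """Return the edge set of the horizontal visibility graph of `series`.

    Nodes are the indices of the series. Indices i < j are connected when
    series[k] < min(series[i], series[j]) for every k with i < k < j.
    """
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")  # max of the values strictly between i and j
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # nothing further to the right can see point i
    return edges


edges = horizontal_visibility_edges([3, 1, 2, 4])
```

Neighbouring points are always connected, so the resulting graph has at least `len(series) - 1` edges.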
The method of surrogates is widely used in the field of nonlinear data analysis for testing for weak nonlinearities. The two most commonly used algorithms for generating surrogates are the amplitude adjusted Fourier transform (AAFT) and the…
We extensively study the rotational group structure inside the patch space by introducing the fiber bundle structure. The rotational group structure leads to a new image denoising algorithm called the vector non-local Euclidean…
A new ensemble filter that allows for the uncertainty in the prior distribution is proposed and tested. The filter relies on the conditional Gaussian distribution of the state given the model-error and predictability-error covariance…
We present a method for the nonparametric estimation of the drift function of certain types of stochastic differential equations from the empirical density. It is based on a variational formulation of the Fokker-Planck equation. The…
The search for new significant peaks over an energy spectrum often involves a statistical multiple hypothesis testing problem. Separate tests of hypothesis are conducted at different locations producing an ensemble of local p-values, the…