Data Analysis, Statistics and Probability
One hundred years after Smoluchowski introduced his approach to stochastic processes, they now form the basis of mathematical and physical modeling in cellular biology: they are used, for example, to analyse and extract features from large…
We formulate a reduced-order strategy for efficiently forecasting complex high-dimensional dynamical systems entirely based on data streams. The first step of our method involves reconstructing the dynamics in a reduced-order subspace of…
Testing a new physical model against observed experimental data is a typical problem in modern high energy physics (HEP) experiments. The problem can be solved using two alternative statistical formalisms, namely…
Various methods have been developed independently to study the multifractality of measures in many different contexts. Although they all convey the same intuitive idea of giving a "dimension" to sets where a quantity scales similarly within…
A Bayesian approach is proposed for pulse shape discrimination of photons and neutrons in liquid organic scintillators. Instead of drawing a decision boundary, each pulse is assigned a photon or neutron confidence probability. This allows…
We propose a 2D generalization to the $M$-band case of the dual-tree decomposition structure (initially proposed by N. Kingsbury and further investigated by I. Selesnick) based on a Hilbert pair of wavelets. We particularly address…
In this paper, a Bayesian method for piecewise regression is adapted to handle Poisson-distributed counting-process data. A numerical code in Mathematica is developed and tested on simulated data. The resulting method is…
Searches for faint signals in counting experiments are often encountered in particle physics and astrophysics, as well as in other fields. Many problems can be reduced to the case of a model with independent and Poisson-distributed signal…
The so-called "optimal filter" analysis of a microcalorimeter's x-ray pulses is statistically optimal only if all pulses have the same shape, regardless of energy. The shapes of pulses from a nonlinear detector can and do depend on the…
For experiments with high arrival rates, reliable identification of nearly coincident events can be crucial. For calorimetric experiments that directly measure the neutrino mass, such as HOLMES, unidentified pulse pile-ups are expected to be…
The Huang-Hilbert transform is applied to Seismic Electric Signal (SES) activities in order to decompose them into a number of Intrinsic Mode Functions (IMFs) and study which of these functions better represent the SES. The results are…
Bayesian inference provides a principled way of estimating the parameters of a stochastic process that is observed discretely in time. The overdamped Brownian motion of a particle confined in an optical trap is generally modelled by the…
The implicit particle filter is a sequential Monte Carlo method for data assimilation that guides the particles to the high-probability regions via a sequence of steps that includes minimizations. We present a new and more general…
On-line data assimilation techniques such as ensemble Kalman filters and particle filters lose accuracy dramatically when presented with an unlikely observation. Such an observation may be caused by an unusually large measurement error or…
We present a new cycle flow based method for finding fuzzy partitions of weighted directed networks coming from time series data. We show that this method overcomes essential problems of most existing clustering approaches, which tend to…
We describe a new method of 3D image reconstruction of neutron sources that emit correlated gammas (e.g. Cf-252, Am-Be). This category includes the vast majority of neutron sources important in nuclear threat search, safeguards and…
In this report, we applied the expectation-maximization (EM) method described by Philips et al. [1] to recover two-dimensional (2D) structure from multiple sparse signal images in random orientations. The detailed derivation of the EM algorithm…
Notes for a Course on Probability and Statistics: L1: Elements of Probability; L2: Bayesian Inference; L3: Monte Carlo Methods
It is a widely accepted fact that the computational capability of recurrent neural networks is maximized on the so-called "edge of criticality". Once the network operates in this configuration, it performs efficiently on a specific…
Many real time series exhibit behavior characteristic of long-range dependent data. In addition, these time series very often have constant time periods and characteristics similar to Gaussian processes, although they are not Gaussian.…