Data Analysis, Statistics and Probability
Statistical learning of materials properties or functions has so far started with a largely silent, unchallenged step: the choice of the set of descriptive parameters (termed the descriptor). However, when the scientific connection between the…
Nanopore resistive pulse techniques are based on the analysis of current or voltage spikes in the recorded signal. These spikes result from the translocation of nanometer-sized analytes through a nanopore. The most important information that needs…
In the context of the sensational results concerning superluminal velocities, announced recently by the OPERA Collaboration, we have proposed a classical model yielding a statistically calculated measured velocity of a beam, higher than the…
The ATLAS detector at CERN has completed its first full year of recording collisions at 7 TeV, resulting in billions of events and petabytes of data. At these scales, physicists must have the capability to read only the data of interest to…
In complex networks it is common for each node to belong to several communities, implying a highly overlapping community structure. Recent advances in benchmarking indicate that existing community assignment algorithms that are capable of…
Methods to extract information from the tracking of mobile objects/particles are of broad interest in the biological and physical sciences. Techniques based on simple criteria of proximity in time-consecutive snapshots are useful to identify the…
We use Detrended Cross-Correlation Analysis (DCCA) to investigate the influence of solar activity, represented by sunspot numbers, on one climate indicator, namely rivers, represented by river flow fluctuations for the Daugava,…
In a recent paper [2] the author introduced and investigated a random walk model similar to a model introduced in [1]. In these models the increment of the random walk depends on the complete past of the process. In this note I will point…
A procedure based on a Mixture Density Model for correcting experimental data for distortions due to finite resolution and limited detector acceptance is presented. Addressing the case that the solution is known to be non-negative, in the…
We introduce a new kind of likelihood function based on the sequence of moments of the data distribution. Both binned and unbinned data samples are discussed, and the multivariate case is also derived. Building on this approach we lay out…
This paper presents an analysis of 30 literary texts written in English by different authors. For each text, a time series representing sentence length in words was created, and its fractal properties were analyzed using two methods of…
Long-range correlated processes are ubiquitous, ranging from climate variables to financial time series. One paradigmatic example of such processes is fractional Brownian motion (fBm). In this work, we highlight the potentials and…
The calibration of a measurement device is crucial for every scientific experiment in which a signal has to be inferred from data. We present CURE, the calibration uncertainty renormalized estimator, to reconstruct a signal and simultaneously…
We review the state space decomposition techniques for the assessment of the noise properties of autonomous oscillators, a topic of great practical and theoretical importance for many applications in many different fields, from electronics,…
The one-dimensional random trap model with a power-law distribution of mean sojourn times exhibits a phenomenon of dynamical localization in the case where diffusion is anomalous: The probability to find two independent walkers at the same…
Due to its constrained support, the Dirichlet distribution is uniquely suited to many applications. The constraints that make it powerful, however, can also hinder practical implementations, particularly those utilizing Markov Chain Monte…
Nested sampling is a powerful approach to Bayesian inference ultimately limited by the computationally demanding task of sampling from a heavily constrained probability distribution. An effective algorithm in its own right, Hamiltonian…
Kernel density estimation is a convenient way to estimate the probability density of a distribution given a sample of data points. However, it has certain drawbacks: proper description of the density using narrow kernels needs large data…
Signal processing techniques have been developed that use different strategies to bypass the Nyquist sampling theorem in order to recover more information than a traditional discrete Fourier transform. Here we examine three such methods:…
We describe a simple stochastic method, the so-called Langevin approach, which enables one to extract evolution equations of stochastic variables from a set of measurements. Our method is parameter-free and is based on the nonlinear Langevin…