Data Analysis, Statistics and Probability
An algorithm for optimization of signal significance or any other classification figure of merit suited for analysis of high energy physics (HEP) data is described. This algorithm trains decision trees on many bootstrap replicas of training…
Predictive skill of complex models is often not uniform in model-state space; in weather forecasting models, for example, the skill of the model can be greater in populated regions of interest than in "remote" regions of the globe. Given a…
Point pattern sets arise in many different areas of physical, biological, and applied research, representing many random realizations of underlying pattern formation mechanisms. These pattern sets can be heterogeneous with respect to…
This manuscript addresses selected aspects of computing for the reconstruction and simulation of particle interactions in subnuclear physics. Based on personal experience with experiments at DESY and at CERN, I cover the evolution of…
Networks are a natural and popular mechanism for the representation and investigation of a broad class of systems. But extracting information from a network can present significant challenges. We present NetzCope, a software application for…
A track finding algorithm has been developed for reconstruction of e+e- pairs. It combines the information of the electromagnetic calorimeter with the information provided by the Tracker. Results on reconstruction efficiency of converted…
We present the first world-wide inter-laboratory comparison of small-angle X-ray scattering (SAXS) for nanoparticle sizing. The measurands in this comparison are the mean particle radius, the width of the size distribution and the particle…
A computational method is presented which efficiently segments digital grayscale images by directly applying the Q-state Ising (or Potts) model. Since the Potts model was first proposed in 1952, physicists have studied lattice models to…
Mechanistic dynamic models of biochemical networks, such as Ordinary Differential Equations (ODEs), contain unknown parameters such as the reaction rate constants and the initial concentrations of the compounds. The large number of parameters as…
The main goal of the paper is to develop an estimate for the conditional probability function of random stationary ergodic symbolic sequences with elements belonging to a finite alphabet. We elaborate a decomposition procedure for the…
In this article we present a very intuitive, easy-to-follow, yet mathematically rigorous approach to the so-called data fitting process. Rather than minimizing the distance between measured and simulated data points, we prefer to find such…
Accelerators and detectors are expensive, both in terms of money and human effort. It is thus important to invest effort in performing a good statistical analysis of the data, in order to extract the best information from it. This series of…
The analysis of data sets arising from multiple sensors has drawn significant research attention over the years. Traditional methods, including kernel-based methods, are typically incapable of capturing nonlinear geometric structures. We…
We study the inverse problem of localization (imaging) of a laser beam from measurements of the intensity of light scattered off-axis by a Poisson cloud of small particles. Starting from the wave equation, we analyze the microscopic…
There is growing evidence that cognitive processes may have fractal structures as a signature of complexity. It is an ongoing topic of research to study the class of complexity and how it may differ as a function of cognitive variables.…
Pseudo-Random Number Generators (PRNGs) are key tools in Monte Carlo simulations. More recently, the MIXMAX PRNG has been included in the ROOT and Class Library for High Energy Physics (CLHEP) software packages and claims to be a state of…
We consider travel time tomography problems involving detection of high contrast, discrete high velocity structures. This results in a discrete nonlinear inverse problem, for which traditional grid-based models and iterative linearized…
We introduce a kernel Lasso (kLasso) optimization that simultaneously accounts for spatial regularity and network sparsity to reconstruct spatial complex networks from data. Through a kernel function, the proposed approach exploits spatial…
We derive formulas for the efficiency correction of cumulants with many efficiency bins. The derivation of the formulas is simpler than that of the previously suggested method, and the numerical cost is drastically reduced compared with the naive method.…
We revisit the precision of the measurement of track parameters (position, angle) with optimal methods in the presence of detector resolution, multiple scattering and zero magnetic field. We then obtain an optimal estimator of the track…