Data Analysis, Statistics and Probability
Deep neural networks have rightfully earned their place among the most accurate analysis tools in high energy physics. In this paper we cover several methods of improving the performance of a deep neural network in a classification task…
We explore the application of a Convolutional Neural Network (CNN) to image the shear modulus field of an almost incompressible, isotropic, linear elastic medium in plane strain using displacement or strain field data. This problem is…
Proton imaging is a powerful technique for imaging electromagnetic fields within an experimental volume, in which spatial variations in proton fluence are a result of deflections to proton trajectories due to interaction with the fields.…
The traditional approach in HEP analysis software is to loop over every event and every object via the ROOT framework. This method follows an imperative paradigm, in which the code is tied to the storage format and steps of execution. A…
We generalize the concept of local states (LS) for the prediction of high-dimensional, potentially mixed chaotic systems. The construction of generalized local states (GLS) relies on defining distances between time series on the basis of…
We describe the use of machine learning algorithms to select high-quality measurements for the Mu2e experiment. This technique is important for experiments with backgrounds that arise due to measurement errors. The algorithms use multiple…
Geant4Reweight is an open-source C++ framework that allows users to 1) weight tracks produced by the GEANT4 particle transport Monte Carlo simulation according to hadron interaction cross section variations and 2) estimate uncertainties in…
The manuscript is about an interlaboratory comparison which involved eleven metrology institutes. It comprises four tasks: i) deriving a consensus value from these results; ii) evaluating the associated standard uncertainty; iii) producing…
Probability distributions and densities are derived for the excess and deficiency of the intensity or instantaneous energy (quasi-static power) associated with a $p$-dimensional random vector field. Explicit expressions for the exact…
Inferring dynamics from time series is an important objective in data analysis. In particular, it is challenging to infer stochastic dynamics given incomplete data. We propose an expectation maximization (EM) algorithm that iterates between…
A reformulated implementation of single-sideband ptychography enables analysis and display of live detector data streams in 4D scanning transmission electron microscopy (STEM) using the LiberTEM open-source platform. This is combined with…
With the LHC continuing to collect more data and experimental analyses becoming increasingly complex, tools to efficiently develop and execute these analyses are essential. The bamboo framework defines a domain-specific language, embedded…
For a high source activity experiment, such as HOLMES, non-constant baseline pulses could constitute a large fraction of the dataset. We test the optimal filter matrix technique, proposed to process these pulses, on simulated responses of…
Mixture Density Networks (MDNs) can be used to generate probability density functions of model parameters $\boldsymbol{\theta}$ given a set of observables $\mathbf{x}$. In some applications, training data are available only for discrete…
The spectra of empirical correlation matrices, constructed from multivariate data, are widely used in many areas of science, engineering and the social sciences as a tool to understand the information contained in typically large datasets. In…
Binned maximum likelihood fits are an attractive option when analysing large datasets, but require care when computing likelihoods of continuous PDFs in bins. For many years the widely used statistical modelling package RooFit evaluated…
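To illustrate why computing the likelihood of a continuous PDF in bins requires care, the following minimal sketch compares two ways of assigning a probability to a bin: evaluating the PDF at the bin centre and multiplying by the bin width, versus integrating the PDF over the bin. It is not taken from the paper and does not describe RooFit's implementation; the exponential model and all function names (`bin_prob_center`, `bin_prob_integral`, `binned_nll`) are assumptions chosen for illustration.

```python
import math

# Hypothetical example model: exponential PDF f(x) = rate * exp(-rate * x), x >= 0.
def exp_pdf(x, rate):
    return rate * math.exp(-rate * x)

def exp_cdf(x, rate):
    return 1.0 - math.exp(-rate * x)

def bin_prob_center(lo, hi, rate):
    """Approximate bin probability: PDF at the bin centre times the bin width."""
    return (hi - lo) * exp_pdf(0.5 * (lo + hi), rate)

def bin_prob_integral(lo, hi, rate):
    """Exact bin probability: integral of the PDF over the bin, via the CDF."""
    return exp_cdf(hi, rate) - exp_cdf(lo, rate)

def binned_nll(counts, edges, rate, bin_prob):
    """Multinomial negative log-likelihood (up to a constant) for binned counts.

    Bin probabilities are renormalised over the fitted range so that the two
    bin-probability choices can be compared on an equal footing.
    """
    probs = [bin_prob(lo, hi, rate) for lo, hi in zip(edges[:-1], edges[1:])]
    total = sum(probs)
    return -sum(n * math.log(p / total) for n, p in zip(counts, probs) if n > 0)

if __name__ == "__main__":
    edges = [0.0, 1.0, 2.0, 3.0]   # wide bins exaggerate the difference
    counts = [630, 230, 90]        # made-up counts, for illustration only
    for name, fn in [("centre", bin_prob_center), ("integral", bin_prob_integral)]:
        print(name, binned_nll(counts, edges, rate=1.0, bin_prob=fn))
```

For a PDF that curves strongly within a bin, the centre approximation assigns a different probability than the integral, and the size of the difference grows with the bin width. In a fit this can bias the estimated parameters, which is one reason wide bins call for integration rather than point evaluation.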
This paper is a review of a particular approach to the method of maximum entropy as a general framework for inference. The discussion emphasizes the pragmatic elements in the derivation. An epistemic notion of information is defined in…
Langevin models are frequently used to model various stochastic processes in different fields of natural and social sciences. They are adapted to measured data by estimation techniques such as maximum likelihood estimation, Markov chain…
This paper presents a statistical analysis of federal highway bridges commonly found in Northeastern Brazil to develop a portfolio, or statistically representative characterization of bridges across the region. A detailed study of bridges…
Mix-cumulants of conserved charge distributions are sensitive observables for probing properties of the QCD medium and phase transition in heavy-ion collisions. To perform precise measurements, efficiency correction is one of the most…