Data Analysis, Statistics and Probability
Maximum likelihood fits to data can be done using binned data (histograms) and unbinned data. With binned data, one gets not only the fitted parameters but also a measure of the goodness of fit. With unbinned data, currently, the fitted…
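As an illustration of the binned/unbinned distinction described above, here is a minimal sketch in Python. Everything in it is an assumption made for the example and is not taken from the abstract: the exponential decay model, the names `unbinned_mle_tau` and `binned_fit_tau`, the grid scan, and the use of a Poisson likelihood-ratio chi-square as the goodness-of-fit measure for the binned fit.

```python
import math
import random

def unbinned_mle_tau(times):
    """Unbinned ML estimate of the mean lifetime of an untruncated exponential: the sample mean."""
    return sum(times) / len(times)

def histogram(times, edges):
    """Count entries in equal-width bins; values outside [edges[0], edges[-1]) are dropped."""
    n_bins = len(edges) - 1
    lo, hi = edges[0], edges[-1]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for t in times:
        if lo <= t < hi:
            counts[min(int((t - lo) / width), n_bins - 1)] += 1
    return counts

def expected_counts(tau, edges, n_total):
    """Expected bin contents for an exponential truncated to the histogram range."""
    norm = math.exp(-edges[0] / tau) - math.exp(-edges[-1] / tau)
    return [n_total * (math.exp(-a / tau) - math.exp(-b / tau)) / norm
            for a, b in zip(edges[:-1], edges[1:])]

def likelihood_ratio_chi2(counts, mu):
    """Poisson likelihood-ratio chi-square: 2 * sum(mu - n + n * ln(n / mu))."""
    total = 0.0
    for n, m in zip(counts, mu):
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total

def binned_fit_tau(counts, edges, tau_grid):
    """Scan tau over a grid; minimising this chi-square maximises the binned likelihood."""
    n_total = sum(counts)
    return min((likelihood_ratio_chi2(counts, expected_counts(t, edges, n_total)), t)
               for t in tau_grid)

random.seed(1)
TAU_TRUE = 2.0  # hypothetical true lifetime used to generate toy data
times = [random.expovariate(1.0 / TAU_TRUE) for _ in range(5000)]

edges = [0.5 * i for i in range(21)]              # 20 bins on [0, 10)
counts = histogram(times, edges)
tau_grid = [1.0 + 0.005 * i for i in range(401)]  # candidate lifetimes in [1, 3]

tau_unbinned = unbinned_mle_tau(times)
chi2_min, tau_binned = binned_fit_tau(counts, edges, tau_grid)
ndf = len(counts) - 2  # bins minus fixed normalisation minus one fitted parameter

print(f"unbinned tau = {tau_unbinned:.3f}")
print(f"binned   tau = {tau_binned:.3f}, chi2/ndf = {chi2_min:.1f}/{ndf}")
```

The unbinned estimate uses every event at its exact value but yields only the parameter. The binned fit additionally returns a chi-square that can be compared with the number of degrees of freedom.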
Unbinned likelihood fits are frequent in physics, and often involve complex functions with several components. We discuss the potential pitfalls of situations where the templates used in the fit are not fixed but depend on the event…
Errors quoted on results are often given in asymmetric form. An account is given of the two ways these can arise in an analysis, and the combination of asymmetric errors is discussed. It is shown that the usual method has no basis and is…
A review of the blind analysis technique, as used in particle physics measurements, is presented. The history of blind analyses in physics is briefly discussed. Next the dangers of "experimenter's bias" and the advantages of a blind…
The frequentist interpretation of measurement results requires the specification of an ensemble of independent replications of the same experiment. For complex calculations of bias, coverage, significance, etc., this ensemble is often…
I compare and discuss critically several measures of statistical significance in common use in astrophysics and in high energy physics. I also exhibit some relationships among them.
We examine computational, conceptual, and philosophical issues in moving the statistical techniques used in the LEP Higgs working group to the LHC.
We describe a test statistic for unbinned goodness-of-fit of data in one dimension. The statistic is based on the two-dimensional random walk. The rejection power of this test is explored both for simple and compound hypotheses and, for the…
An account is given of the methods of working of Experimental High Energy Particle Physics, from the viewpoint of statisticians and others unfamiliar with the field. Current statistical problems, techniques, and hot topics are introduced…
The value of the likelihood is occasionally used by high energy physicists as a statistic to measure goodness-of-fit in unbinned maximum likelihood fits. Simple examples are presented that illustrate why this (seemingly intuitive) method…
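One standard textbook illustration of this issue, not necessarily the example used in the paper, is the exponential model. There the maximised log-likelihood is $-N(\ln\hat\tau + 1)$ with $\hat\tau$ the sample mean, so it depends on the data only through the mean. The sketch below (function names and toy samples are hypothetical) checks this numerically for two samples with very different shapes.

```python
import math

def log_likelihood(times, tau):
    """Unbinned log-likelihood of an exponential with mean lifetime tau."""
    return sum(-math.log(tau) - t / tau for t in times)

def max_log_likelihood(times):
    """Log-likelihood evaluated at the ML estimate tau_hat = sample mean."""
    tau_hat = sum(times) / len(times)
    return log_likelihood(times, tau_hat)

# Two toy samples with the same mean (2.0) but very different shapes.
sample_spread = [0.1, 0.4, 0.9, 1.6, 7.0]  # roughly exponential-looking
sample_spike = [2.0] * 5                   # all values identical: clearly not exponential

ll_spread = max_log_likelihood(sample_spread)
ll_spike = max_log_likelihood(sample_spike)
closed_form = -5 * (math.log(2.0) + 1.0)   # -N (ln tau_hat + 1)

print(ll_spread, ll_spike, closed_form)
```

Both samples give the same maximised likelihood, so in this model that value cannot tell a plausible exponential sample from an implausible one.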
Multivariate analysis is an increasingly common tool in experimental high energy physics; however, many of the common approaches were borrowed from other fields. We clarify what the goal of a multivariate algorithm should be for the search…
We consider the standard Neyman-Pearson hypothesis test of a signal-plus-background hypothesis and background-only hypothesis in the presence of uncertainty on the background-only prediction. Surprisingly, this problem has not been…
The class framework developed for vertex reconstruction in CMS is described. We emphasize how we proceed to develop a flexible, efficient and reliable piece of reconstruction software. We describe the decomposition of the algorithms into…
Currently available satellite active fire detection products from the VIIRS and MODIS instruments on polar-orbiting satellites produce detection squares in arbitrary locations. There is no global fire/no fire map, no detection under cloud…
In a search scenario, nuclear background spectra are continuously measured in short acquisition intervals with a mobile detector-spectrometer. Detecting sources from measured data is difficult because of the low signal-to-noise ratio (S/N) of…
The entropic form $S_q$ is, for any $q \neq 1$, {\it nonadditive}. Indeed, for two probabilistically independent subsystems, it satisfies $S_q(A+B)/k=[S_q(A)/k]+[S_q(B)/k]+(1-q)[S_q(A)/k][S_q(B)/k] \ne S_q(A)/k+S_q(B)/k$. This form will…
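The nonadditivity relation quoted above can be checked numerically. The sketch below assumes the usual discrete definition $S_q/k = (1-\sum_i p_i^q)/(q-1)$, which the truncated abstract does not spell out. The function names and the two toy distributions are hypothetical.

```python
import math

def tsallis_entropy(probs, q):
    """S_q / k for a discrete distribution; reduces to the Shannon form as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

def joint_independent(pa, pb):
    """Joint distribution of two probabilistically independent subsystems."""
    return [a * b for a in pa for b in pb]

def pseudo_additive(sa, sb, q):
    """Right-hand side of the relation: S_q(A) + S_q(B) + (1 - q) S_q(A) S_q(B), in units of k."""
    return sa + sb + (1.0 - q) * sa * sb

pa = [0.5, 0.3, 0.2]  # toy distribution for subsystem A
pb = [0.7, 0.3]       # toy distribution for subsystem B
q = 2.0

sa = tsallis_entropy(pa, q)
sb = tsallis_entropy(pb, q)
sab = tsallis_entropy(joint_independent(pa, pb), q)

print(f"S_q(A+B)            = {sab:.4f}")
print(f"pseudo-additive sum = {pseudo_additive(sa, sb, q):.4f}")
print(f"plain sum           = {sa + sb:.4f}")
```

For $q = 2$ the joint entropy matches the pseudo-additive expression and differs from the plain sum. Setting `q = 1.0` recovers ordinary additivity.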
The formation and accretion of ice on the leading edge of a wing can be detrimental to airplane performance. Complicating this reality is the fact that even a small amount of uncertainty in the shape of the accreted ice may result in a…
When fitting theory to data in the presence of background uncertainties, the question of whether the spectral shape of the background happens to be similar to that of the theoretical model of physical interest has not generally been…
Using the latest numerical simulations of rotating stellar core collapse, we present a Bayesian framework to extract the physical information encoded in noisy gravitational wave signals. We fit Bayesian principal component regression models…
A very simple heuristic approach to the unfolding problem will be described. An iterative algorithm starts with an empty histogram and every iteration aims to add one entry to this histogram. The entry to be added is selected according to a…