Machine Learning on sWeighted Data

High Energy Physics - Experiment | 2020-08-26 | v1 | Machine Learning; Data Analysis, Statistics and Probability; Machine Learning

Abstract

Data analysis in high energy physics has to deal with data samples produced by different sources. One of the most widely used ways to unfold their contributions is the sPlot technique, which uses the results of a maximum likelihood fit to assign weights to events. Some of the weights produced by sPlot are negative by design. Negative weights make it difficult to apply machine learning methods: the loss function becomes unbounded, which leads to divergent neural network training. In this paper we propose a mathematically rigorous way to transform the weights obtained by sPlot into class probabilities conditioned on observables, so that any machine learning algorithm can be applied out of the box.
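
The unboundedness mentioned in the abstract can be illustrated with a minimal sketch. It assumes a weighted binary cross-entropy loss and a single signal-labelled event; the function name `weighted_log_loss` and the weights of ±0.5 are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def weighted_log_loss(p, y, w):
    """Weighted binary cross-entropy summed over events.

    p: predicted signal probability, y: label in {0, 1}, w: per-event weight.
    (Illustrative helper, not the paper's implementation.)
    """
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(-w * (y * np.log(p) + (1.0 - y) * np.log1p(-p))))

# One signal-labelled event, scored at increasingly wrong predictions.
predictions = [1e-1, 1e-3, 1e-6]

# Positive weight: the loss grows as the prediction gets worse, and is bounded below by 0.
positive = [weighted_log_loss([p], [1], [0.5]) for p in predictions]

# Negative weight: the loss keeps decreasing as p -> 0, so it has no lower bound.
negative = [weighted_log_loss([p], [1], [-0.5]) for p in predictions]

for p, lp, ln in zip(predictions, positive, negative):
    print(f"p={p:g}  loss(w=+0.5)={lp:.4f}  loss(w=-0.5)={ln:.4f}")
```

With a negative weight, an optimizer can lower the loss indefinitely by pushing the prediction for that event toward zero, which is one way to see why training may diverge.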

Cite

@article{arxiv.1912.02590,
  title  = {Machine Learning on sWeighted Data},
  author = {Maxim Borisyak and Nikita Kazeev},
  journal= {arXiv preprint arXiv:1912.02590},
  year   = {2020}
}

Comments

Submitted to Journal of Physics: Conference Series (ACAT-2019 proceedings)