Multivariate Confidence Intervals

Applications 2017-01-23 v1 Machine Learning

Abstract

Confidence intervals are a popular way to visualize and analyze data distributions. Unlike p-values, they can convey information about both statistical significance and effect size. However, very little work exists on applying confidence intervals to multivariate data. In this paper we define confidence intervals for multivariate data that extend the one-dimensional definition in a natural way. In our definition every variable is associated with its own confidence interval as usual, but a data vector can fall outside a few of these intervals and still be considered to lie within the confidence area. We analyze the problem and show that the resulting confidence areas retain the good qualities of their one-dimensional counterparts: they are informative and easy to interpret. Furthermore, we show that the problem of finding multivariate confidence intervals is hard, but we provide efficient approximate algorithms to solve it.
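
The membership rule described in the abstract can be illustrated with a minimal Python sketch. It only checks whether a data vector lies within a given confidence area; it does not implement the paper's algorithms for finding the intervals. The function names and the `max_outside` parameter (the number of variables allowed to fall outside their own interval) are hypothetical names chosen for this illustration, and the interval bounds are assumed to be given.

```python
from typing import Sequence


def count_outside(x: Sequence[float],
                  lower: Sequence[float],
                  upper: Sequence[float]) -> int:
    """Count the variables of x that fall outside their own interval."""
    return sum(1 for v, lo, hi in zip(x, lower, upper) if v < lo or v > hi)


def in_confidence_area(x: Sequence[float],
                       lower: Sequence[float],
                       upper: Sequence[float],
                       max_outside: int) -> bool:
    """A vector is within the area if at most max_outside variables
    (a hypothetical parameter name) lie outside their intervals."""
    return count_outside(x, lower, upper) <= max_outside


def coverage(data: Sequence[Sequence[float]],
             lower: Sequence[float],
             upper: Sequence[float],
             max_outside: int) -> float:
    """Fraction of data vectors that lie within the confidence area."""
    inside = sum(in_confidence_area(row, lower, upper, max_outside)
                 for row in data)
    return inside / len(data)


# Illustrative bounds for three variables (made-up values).
lower = [0.0, 0.0, 0.0]
upper = [1.0, 1.0, 1.0]
x = [0.5, 1.2, 0.3]  # only the second variable is outside its interval
```

With `max_outside=0` the rule reduces to requiring every variable to be inside its interval, so `x` is rejected; with `max_outside=1` the same vector is accepted.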

Cite

@article{arxiv.1701.05763,
  title  = {Multivariate Confidence Intervals},
  author = {Jussi Korpela and Emilia Oikarinen and Kai Puolamäki and Antti Ukkonen},
  journal= {arXiv preprint arXiv:1701.05763},
  year   = {2017}
}

Comments

A short version of this paper appeared at the 2017 SIAM International Conference on Data Mining (SDM'17). This extended version contains proofs of the theorems in the appendix.