Related papers: A Tutorial on Multivariate $k$-Statistics and thei…

A new method for fast computing unbiased estimators of cumulants

We propose new algorithms for generating $k$-statistics, multivariate $k$-statistics, polykays and multivariate polykays. The resulting computational times are very fast compared with procedures existing in the literature. Such speeding up…

Statistics Theory · Mathematics 2008-08-01 E. Di Nardo, G. Guarino, D. Senato

kStatistics: Unbiased Estimates of Joint Cumulant Products from the Multivariate Faà Di Bruno's Formula

kStatistics is a package in R that serves as a unified framework for estimating univariate and multivariate cumulants as well as products of univariate and multivariate cumulants of a random sample, using unbiased estimators with minimum…

Computation · Statistics 2022-07-01 E. Di Nardo, G. Guarino

A unifying framework for $k$-statistics, polykays and their multivariate generalizations

Through the classical umbral calculus, we provide a unifying syntax for single and multivariate $k$-statistics, polykays and multivariate polykays. From a combinatorial point of view, we revisit the theory as exposed by Stuart and Ord,…

Combinatorics · Mathematics 2008-05-19 Elvira Di Nardo, Giuseppe Guarino, Domenico Senato

A symbolic method for k-statistics

Through the classical umbral calculus, we provide new, compact and easy to handle expressions of k-statistics, and more generally of U-statistics. In addition, such a symbolic method can be naturally extended to multivariate case and to…

Combinatorics · Mathematics 2007-06-13 E. Di Nardo, D. Senato

Ready-to-Use Unbiased Estimators for Multivariate Cumulants Including One That Outperforms $\overline{x^3}$

We present multivariate unbiased estimators for second, third, and fourth order cumulants $C_2(x,y)$, $C_3(x,y,z)$, and $C_4(x,y,z,w)$. Many relevant new estimators are derived for cases where some variables are average-free or pairs of…

Statistics Theory · Mathematics 2019-04-30 Fabian Schefczik, Daniel Hägele

$k$-Variance: A Clustered Notion of Variance

We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $K$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local…

Statistics Theory · Mathematics 2020-12-15 Justin Solomon, Kristjan Greenewald, Haikady N. Nagaraja

Efficient computation of complementary set partitions, with applications to an extension and estimation of generalized cumulants

This paper develops new combinatorial approaches to analyze and compute special set partitions, called complementary set partitions, which are fundamental in the study of generalized cumulants. Moving away from traditional graph-based and…

Statistics Theory · Mathematics 2025-05-20 Elvira Di Nardo, Giuseppe Guarino

Big-Data Clustering: K-Means or K-Indicators?

The K-means algorithm is arguably the most popular data clustering method, commonly applied to processed datasets in some "feature spaces", as is in spectral clustering. Highly sensitive to initializations, however, K-means encounters a…

Machine Learning · Computer Science 2019-06-04 Feiyu Chen, Yuchen Yang, Liwei Xu, Taiping Zhang, Yin Zhang

Time-varying clustering of multivariate longitudinal observations

We propose a statistical method for clustering of multivariate longitudinal data into homogeneous groups. This method relies on a time-varying extension on the classical K-means algorithm, where a multivariate vector autoregressive model is…

Methodology · Statistics 2014-04-25 Antonello Maruotti, Maurizio Vichi

Multivariate boundary regression models

In this work, we consider a multivariate regression model with one-sided errors. We assume for the regression function to lie in a general Hölder class and estimate it via a nonparametric local polynomial approach that consists of…

Statistics Theory · Mathematics 2021-02-11 Leonie Selk, Charles Tillier, Orlando Marigliano

Moment-based Density Elicitation with Applications in Probabilistic Loops

We propose the K-series estimation approach for the recovery of unknown univariate and multivariate distributions given knowledge of a finite number of their moments. Our method is directly applicable to the probabilistic analysis of…

Methodology · Statistics 2025-04-15 Andrey Kofnov, Ezio Bartocci, Efstathia Bura

Statistical and knowledge supported visualization of multivariate data

In the present work we have selected a collection of statistical and mathematical tools useful for the exploration of multivariate data and we present them in a form that is meant to be particularly accessible to a classically trained…

Statistics Theory · Mathematics 2010-09-01 Magnus Fontes

A Linear Time, and Constant Space, Algorithm to Compute the Mixed Moments of the Multivariate Normal Distributions

Using recurrences gotten from the Apagodu-Zeilberger Multivariate Almkvist-Zeilberger algorithm we present a linear-time, and constant-space, algorithm to compute the general mixed moments of the k-variate general normal distribution, with…

Combinatorics · Mathematics 2022-02-22 Shalosh B. Ekhad, Doron Zeilberger

Tutorial: Introduction to computational causal inference using reproducible Stata, R and Python code

The purpose of many health studies is to estimate the effect of an exposure on an outcome. It is not always ethical to assign an exposure to individuals in randomised controlled trials; instead observational data and appropriate study…

Methodology · Statistics 2020-12-22 Matthew J. Smith, Camille Maringe, Bernard Rachet, Mohammad A. Mansournia, Paul N. Zivich, Stephen R. Cole, Miguel Angel Luque-Fernandez

A Scalable Formula for the Moments of a Family of Self-Normalized Statistics

Following the student t-statistic, normalization has been a widely used method in statistics and other disciplines including economics, ecology and machine learning. We focus on statistics taking the form of a ratio over (some power of) the…

Statistics Theory · Mathematics 2025-09-19 Haolin Zou, Heyuan Yao, Victor de la Peña

Explaining Time Series Predictions with Dynamic Masks

How can we explain the predictions of a machine learning model? When the data is structured as a multivariate time series, this question induces additional difficulties such as the necessity for the explanation to embody the time dependency…

Machine Learning · Computer Science 2021-06-11 Jonathan Crabbé, Mihaela van der Schaar

k-sums: another side of k-means

In this paper, the decades-old clustering method k-means is revisited. The original distortion minimization model of k-means is addressed by a pure stochastic minimization procedure. In each step of the iteration, one sample is tentatively…

Machine Learning · Computer Science 2020-05-20 Wan-Lei Zhao, Run-Qing Chen, Hui Ye, Chong-Wah Ngo

Cumulant mapping as the basis of multi-dimensional spectrometry

Cumulant mapping employs a statistical reconstruction of the whole by sampling its parts. The theory developed in this work formalises and extends ad hoc methods of `multi-fold' or `multi-dimensional' covariance mapping. Explicit formulae…

Data Analysis, Statistics and Probability · Physics 2023-11-06 Leszek J. Frasinski

Unbiased Monte Carlo: posterior estimation for intractable/infinite-dimensional models

We provide a general methodology for unbiased estimation for intractable stochastic models. We consider situations where the target distribution can be written as an appropriate limit of distributions, and where conventional approaches…

Methodology · Statistics 2014-12-01 Sergios Agapiou, Gareth O. Roberts, Sebastian J. Vollmer

On the estimation of complex statistics combining different surveys

The importance of exploring a potential integration among surveys has been acknowledged in order to enhance effectiveness and minimize expenses. In this work, we employ the alignment method to combine information from two different surveys…

Methodology · Statistics 2024-04-09 Vasilis Chasiotis, Dimitris Karlis