Related papers: Parametric Modelling of Multivariate Count Data Us…
Count data take on non-negative integer values and are challenging to properly analyze using standard linear-Gaussian methods such as linear regression and principal components analysis. Generalized linear models enable direct modeling of…
Motivated by the need, in some Bayesian likelihood-free inference problems, to impute a multivariate counting distribution based on its vector of means and variance-covariance matrix, we define a generic multivariate discrete…
Clustering multivariate data is a pervasive task in many applied problems, particularly in social studies and life science. Model-based approaches to clustering rely on mixture models, where each mixture component corresponds to the kernel…
The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world…
We consider deep multivariate models for heterogeneous collections of random variables. In the context of computer vision, such collections may e.g. consist of images, segmentations, image attributes, and latent variables. When developing…
Although multivariate count data are routinely collected in many application areas, there is surprisingly little work developing flexible models for characterizing their dependence structure. This is particularly true when interest focuses…
We introduce the concept of pattern graphs: directed acyclic graphs representing how response patterns are associated. A pattern graph represents an identifying restriction that is nonparametrically identified/saturated and is often a…
Multi-dimensional data frequently occur in many different fields, including risk management, insurance, biology, environmental sciences, and many more. In analyzing multivariate data, it is imperative that the underlying modelling…
Considering discrete models, the univariate framework has been studied in depth compared to the multivariate one. This paper first proposes two criteria to define a sensu stricto multivariate discrete distribution. It then introduces the…
Categorical random variables are a common staple in machine learning methods and other applications across disciplines. Many times, correlation within categorical predictors exists, and has been noted to have an effect on various algorithm…
Heterogeneous data from multiple populations, sub-groups, or sources is often represented as a "mixture model" with a single latent class influencing all of the observed covariates. Heterogeneity can be resolved at multiple levels by…
Probabilistic graphical modeling is a branch of machine learning that uses probability distributions to describe the world, make predictions, and support decision-making under uncertainty. Underlying this modeling framework is an elegant…
This paper considers the problem of multi-sample nonparametric comparison of counting processes with panel count data, which arise naturally when recurrent events are considered. Such data frequently occur in medical follow-up studies and…
Multi-category data arise in diverse fields including marketing, chemistry, public policy, genomics, political science, and ecology. We consider the problem of estimating ratios of category-specific means in a fully nonparametric setting,…
We study the spread of information on multi-type directed random graphs. In such graphs the vertices are partitioned into distinct types (communities) that have different transmission rates between themselves and with other types. We…
We review autoregressive models for the analysis of multivariate count time series. In doing so, we discuss the choice of a suitable distribution for a vector of count random variables. This review focuses on three main approaches taken for…
Overdispersion in multivariate count data is a challenging problem. It now plays a central role, mainly due to the relevance of data from modern technologies, such as Next Generation Sequencing and textual data from the web or…
Learning a parametric model from a given dataset makes it possible to capture intrinsic dependencies between random variables via a parametric conditional probability distribution and, in turn, to predict the value of a label variable given…
A geometric representation for multivariate extremes, based on the shapes of scaled sample clouds in light-tailed margins and their so-called limit sets, has recently been shown to connect several existing extremal dependence concepts.…
Species sampling processes have long served as the fundamental framework for modeling random discrete distributions and exchangeable sequences. However, data arising from distinct but related sources require a broader notion of…