Related papers: Cluster-Specific Predictions with Multi-Task Gauss…
A novel multi-task Gaussian process (GP) framework is proposed, by using a common mean process for sharing information across tasks. In particular, we investigate the problem of time series forecasting, with the objective to improve…
Standard Gaussian Process (GP) regression, a powerful machine learning tool, is computationally expensive when it is applied to large datasets, and potentially inaccurate when data points are sparsely distributed in a high-dimensional…
A Gaussian process has been one of the important approaches for emulating computer simulations. However, the stationarity assumption for a Gaussian process and the intractability for large-scale datasets limit its applicability in practice.…
Aggregated data is commonplace in areas such as epidemiology and demography. For example, census data for a population is usually given as averages defined over time periods or spatial resolutions (cities, regions or countries). In this…
We present a Gaussian Process - Latent Class Choice Model (GP-LCCM) to integrate a non-parametric class of probabilistic machine learning within discrete choice models (DCMs). Gaussian Processes (GPs) are kernel-based algorithms that…
Model-based clustering approaches concern the paradigm of exploratory data analysis relying on the finite mixture model to automatically find a latent structure governing observed data. They are one of the most popular and successful…
Grouping observations into homogeneous groups is a recurrent task in statistical data analysis. We consider Gaussian Mixture Models, which are the most famous parametric model-based clustering method. We propose a new robust approach for…
Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and thus to different clustering partitions. Among…
The Gaussian process is an indispensable tool in clustering functional data, owing to its flexibility and inherent uncertainty quantification. However, when the functional data are observed over a large grid (say, of length $p$), Gaussian…
Gaussian processes (GPs) are pervasive in functional data analysis, machine learning, and spatial statistics for modeling complex dependencies. Modern scientific data sets are typically heterogeneous and often contain multiple known…
In this paper, we consider the task of clustering a set of individual time series while modeling each cluster, that is, model-based time series clustering. The task requires a parametric model with sufficient flexibility to describe the…
Gaussian process (GP) regression is a powerful probabilistic modeling technique with built-in uncertainty quantification. When one has access to multiple correlated simulations (tasks), it is common to fit a multitask GP (MTGP) surrogate…
Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, since the number of model parameters scales quadratically with the number of variables, these models can be easily…
We propose a computationally simple framework for clustering functional data based on Gaussian-process-generated random projections. In this approach, each curve is first projected onto a large collection of independent Gaussian process…
A broad class of stochastic volatility models is defined by systems of stochastic differential equations. While these models have seen widespread success in domains such as finance and statistical climatology, they typically lack an…
Creating low dimensional representations of a high dimensional data set is an important component in many machine learning applications. How to cluster data using their low dimensional embedded space is still a challenging problem in…
The Expectation-Maximization (EM) algorithm is one of the most popular methods used to solve the problem of parametric distribution-based clustering in unsupervised learning. In this paper, we propose to analyze a generalized EM (GEM)…
The distributed Gaussian process (DGP) is a popular approach for scaling GPs to big data: it divides the training data into subsets, performs local inference for each partition, and aggregates the results to obtain a global prediction. To…
The Gaussian mixture model (GMM) provides a simple yet principled framework for clustering, with properties suitable for statistical inference. In this paper, we propose a new model-based clustering algorithm, called EGMM (evidential GMM),…
Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the…