State-Space Dynamics Distance for Clustering Sequential Data

Darío García-García; Emilio Parrado-Hernández; Fernando Díaz-de-María

Machine Learning 2010-04-13 v1

Abstract

This paper proposes a novel similarity measure for clustering sequential data. We first construct a common state-space by training a single probabilistic model on all the sequences, which yields a unified representation of the dataset. Distances are then computed from the transition matrices that each sequence induces in that state-space. This approach addresses some of the usual overfitting and scalability issues of existing semi-parametric techniques, which rely on training a separate model for each sequence. Empirical studies on both synthetic and real-world datasets illustrate the advantages of the proposed similarity measure for clustering sequences.
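
The two-step idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model or the distance, so several things here are assumptions made only for illustration: each sequence is taken to be already decoded into hard state labels in the common state-space (a soft, posterior-based assignment would be equally plausible), transition counts receive a small additive smoothing, and rows are compared with a Bhattacharyya distance averaged over states. The names `transition_matrix` and `dynamics_distance` are hypothetical.

```python
import math

def transition_matrix(states, n_states, smoothing=1e-3):
    """Row-stochastic transition matrix induced by one sequence of state labels.

    Assumption: `states` holds integer labels in range(n_states) obtained from
    a single model shared by all sequences. `smoothing` keeps every row valid
    even when a state is never visited.
    """
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def dynamics_distance(p, q):
    """Bhattacharyya distance between matching rows, averaged over states.

    Assumption: this row-wise comparison is one reasonable choice, not
    necessarily the measure used in the paper.
    """
    total = 0.0
    for row_p, row_q in zip(p, q):
        affinity = sum(math.sqrt(a * b) for a, b in zip(row_p, row_q))
        # Clamp to 1.0 so rounding error cannot produce a negative distance.
        total += -math.log(min(affinity, 1.0))
    return total / len(p)

# Toy state sequences in a shared two-state space (made-up data).
seq_a = [0, 0, 1, 1, 0, 0, 1, 1]
seq_b = [0, 1, 0, 1, 0, 1, 0, 1]
seq_c = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]

mats = [transition_matrix(s, n_states=2) for s in (seq_a, seq_b, seq_c)]
# Pairwise distance matrix, usable as input to any distance-based clustering.
distances = [[dynamics_distance(p, q) for q in mats] for p in mats]
```

In this toy example `seq_a` and `seq_c` dwell in each state before switching, while `seq_b` alternates at every step, so the first pair ends up closer than either is to `seq_b`. Because only one model is shared across sequences, adding a sequence costs one counting pass rather than a new model fit.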

Keywords

cluster analysis, distance metric learning, similarity search

Cite

@article{arxiv.1004.1982,
  title  = {State-Space Dynamics Distance for Clustering Sequential Data},
  author = {Darío García-García and Emilio Parrado-Hernández and Fernando Díaz-de-María},
  journal= {arXiv preprint arXiv:1004.1982},
  year   = {2010}
}
