Clustering with a new distance measure based on a dual-rooted tree - CICS Accéder directement au contenu
Article Dans Une Revue Information Sciences Année : 2013

Clustering with a new distance measure based on a dual-rooted tree

Résumé

This paper introduces a novel distance measure for clustering high dimensional data based on the hitting time of two Minimal Spanning Trees (MST) grown sequentially from a pair of points by Prim's algorithm. When the proposed measure is used in conjunction with spectral clustering, we obtain a powerful clustering algorithm that is able to separate neighboring non-convex shaped clusters and to account for local as well as global geometric features of the data set. Remarkably, the new distance measure is a true metric even if the Prim algorithm uses a non-metric dissimilarity measure to compute the edges of the MST. This metric property brings added flexibility to the proposed method. In particular, the method is applied to clustering non Euclidean quantities, such as probability distributions or spectra, using the Kullback-Liebler divergence as a base measure. We reduce computational complexity by applying consensus clustering to a small ensemble of dual rooted MSTs. We show that the resultant consensus spectral clustering with dual rooted MST is competitive with other clustering methods, both in terms of clustering performance and computational complexity. We illustrate the proposed clustering algorithm on public domain benchmark data for which the ground truth is known, on one hand, and on real-world astrophysical data on the other hand.
Fichier principal
Vignette du fichier
infoSc-v2.pdf (660.22 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-00726005 , version 1 (28-08-2012)
hal-00726005 , version 2 (29-11-2013)

Identifiants

Citer

Laurent Galluccio, Olivier J.J. Michel, Pierre Comon, Mark Kliger, Alfred O. Hero. Clustering with a new distance measure based on a dual-rooted tree. Information Sciences, 2013, 251 (Dec), pp.96-113. ⟨10.1016/j.ins.2013.05.040⟩. ⟨hal-00726005v2⟩
536 Consultations
1122 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More