Sequence Covering Similarity for Symbolic Sequence Comparison - Irisa
Preprint, Working Paper. Year: 2018

Sequence Covering Similarity for Symbolic Sequence Comparison

Abstract

This paper introduces the sequence covering similarity, which we formally define to evaluate the similarity between a symbolic sequence (string) and a set of symbolic sequences (strings). From this covering similarity we derive a pairwise distance for comparing two symbolic sequences, and we show that this covering distance is a semimetric. A few examples show how this string metric, computed in $O(n \cdot \log n)$, compares with the Levenshtein distance, which is in $O(n^2)$. A final example presents its application to plagiarism detection.
Main file
CoveringSimilarity-v2.pdf (237.41 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01689286 , version 1 (21-01-2018)
hal-01689286 , version 2 (21-02-2018)
hal-01689286 , version 3 (08-03-2018)

Cite

Pierre-François Marteau. Sequence Covering Similarity for Symbolic Sequence Comparison. 2018. ⟨hal-01689286v3⟩
287 views
165 downloads
