Conference paper

Iterative Spaced Seed Hashing: Closing the Gap Between Spaced Seed Hashing and k-mer Hashing

Abstract: Alignment-free classification of sequences has enabled high-throughput processing of sequencing data in many bioinformatics pipelines. Much work has been done to speed up the indexing of k-mers through hash tables and other data structures. These efforts have led to very fast indexes, but because they are k-mer based, they often lack sensitivity in the presence of sequencing errors or polymorphisms. Spaced seeds are a special type of pattern that accounts for errors or mutations. They improve sensitivity and are now routinely used instead of k-mers in many applications. The major drawback of spaced seeds is that they cannot be hashed efficiently, so their use substantially increases computation time. In this paper we address the problem of efficient spaced seed hashing. We propose an iterative algorithm that combines multiple spaced seed hashes, exploiting the similarity of adjacent hash values to compute the next hash efficiently. We report a series of experiments on hashing HTS reads with several spaced seeds. Our algorithm can compute the hash values of spaced seeds with a speedup of 6.2x, outperforming previous methods. Software and datasets are available at ISSH
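To make the contrast in the abstract concrete, here is a minimal sketch of why k-mers are cheap to hash and spaced seeds are not. It is not the ISSH algorithm from the paper. The 2-bit nucleotide encoding, the function names `kmer_hashes` and `spaced_seed_hashes`, and the seed notation (`1` for a position that must match, `0` for a position that is ignored) are all assumptions made for illustration.

```python
# Illustrative sketch only: the encoding and function names are assumptions,
# not taken from the paper, and this is NOT the ISSH algorithm.
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_hashes(seq, k):
    """Hash every k-mer with a rolling 2-bit encoding.

    Each new hash is derived from the previous one by a shift, an OR and
    a mask, so the cost per position is constant.
    """
    mask = (1 << (2 * k)) - 1
    h = 0
    out = []
    for i, base in enumerate(seq):
        h = ((h << 2) | ENCODE[base]) & mask
        if i >= k - 1:
            out.append(h)
    return out


def spaced_seed_hashes(seq, seed):
    """Hash every window under a spaced seed, naively.

    Only positions marked '1' in the seed contribute. Because those
    positions are not contiguous, the previous hash cannot simply be
    shifted, and each window is re-encoded from scratch.
    """
    care = [j for j, c in enumerate(seed) if c == "1"]
    out = []
    for i in range(len(seq) - len(seed) + 1):
        h = 0
        for j in care:
            h = (h << 2) | ENCODE[seq[i + j]]
        out.append(h)
    return out
```

With the seed `101`, the windows `ACG` and `ATG` receive the same hash because the middle position is ignored, which is the source of the extra sensitivity. The naive loop pays for this by visiting every `1` position in every window; reducing that per-window cost is the problem the paper addresses.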

https://hal.archives-ouvertes.fr/hal-02146404
Contributor: Laurent Noé
Submitted on: Wednesday, June 5, 2019 - 12:04:42
Last modified on: Thursday, March 24, 2022 - 03:43:02

File

ISSH_Camera.pdf
Explicit agreement for this deposit

Citation

Enrico Petrucci, Laurent Noé, Cinzia Pizzi, Matteo Comin. Iterative Spaced Seed Hashing: Closing the Gap Between Spaced Seed Hashing and k-mer Hashing. 15th International Symposium on Bioinformatics Research and Applications (ISBRA), Jun 2019, Barcelona, Spain. pp.208-219, ⟨10.1007/978-3-030-20242-2_18⟩. ⟨hal-02146404⟩

Metrics

Record views

115

File downloads

201