Minimal Absent Words in a Sliding Window and Applications to On-Line Pattern Matching

Abstract : An absent (or forbidden) word of a word y is a word that does not occur in y. It is then called minimal if all its proper factors occur in y. There exist linear-time and linear-space algorithms for computing all minimal absent words of y (Crochemore et al. in Inf Process Lett 67:111–117, 1998; Belazzougui et al. in ESA 8125:133–144, 2013; Barton et al. in BMC Bioinform 15:388, 2014). Minimal absent words are used for data compression (Crochemore et al. in Proc IEEE 88:1756–1768, 2000, Ota and Morita in Theoret Comput Sci 526:108–119, 2014) and for alignment-free sequence comparison by utilizing a metric based on minimal absent words (Chairungsee and Crochemore in Theoret Comput Sci 450:109–116, 2012). They are also used in molecular biology; for instance, three minimal absent words of the human genome were found to play a functional role in a coding region in Ebola virus genomes (Silva et al. in Bioinformatics 31:2421–2425, 2015). In this article we introduce a new application of minimal absent words for on-line pattern matching. Specifically, we present an algorithm that, given a pattern x and a text y, computes the distance between x and every window of size |x| on y. The running time is O(σ|y|)O(σ|y|) , where σσ is the size of the alphabet. Along the way, we show an O(σ|y|)O(σ|y|) -time and O(σ|x|)O(σ|x|) -space algorithm to compute the minimal absent words of every window of size |x| on y, together with some new combinatorial insight on minimal absent words.
Type de document :
Communication dans un congrès
Fundamentals of Computation Theory, Sep 2017, Bordeaux, France. Springer, 10472, pp.164 - 176, 2017, Fundamentals of Computation Theory. 〈http://fct2017.labri.fr/〉. 〈10.1007/978-3-662-55751-8_14〉
Liste complète des métadonnées

Littérature citée [34 références]  Voir  Masquer  Télécharger

https://hal-upec-upem.archives-ouvertes.fr/hal-01616485
Contributeur : Maxime Crochemore <>
Soumis le : jeudi 26 octobre 2017 - 11:06:11
Dernière modification le : jeudi 10 mai 2018 - 02:06:37
Document(s) archivé(s) le : samedi 27 janvier 2018 - 12:10:24

Fichier

paper_37.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

Citation

Maxime Crochemore, Alice Héliou, Gregory Kucherov, Laurent Mouchard, Solon Pissis, et al.. Minimal Absent Words in a Sliding Window and Applications to On-Line Pattern Matching. Fundamentals of Computation Theory, Sep 2017, Bordeaux, France. Springer, 10472, pp.164 - 176, 2017, Fundamentals of Computation Theory. 〈http://fct2017.labri.fr/〉. 〈10.1007/978-3-662-55751-8_14〉. 〈hal-01616485〉

Partager

Métriques

Consultations de la notice

241

Téléchargements de fichiers

38