Languages of lossless seeds

Abstract : Several algorithms for similarity search employ seeding techniques to quickly discard very dissimilar regions. In this paper, we study theoretical properties of lossless seeds, i.e., spaced seeds having full sensitivity. We prove that lossless seeds coincide with languages of certain sofic subshifts, hence they can be recognized by finite automata. Moreover, we show that these subshifts are fully given by the number of allowed errors k and the seed margin ℓ. We also show that for a fixed k, optimal seeds must asymptotically satisfy ℓ ∈ Q(m^(k/k+1)).
Document type :
Conference papers
Complete list of metadatas

Cited literature [21 references]  Display  Hide  Download

https://hal-upec-upem.archives-ouvertes.fr/hal-01018062
Contributor : Karel Břinda <>
Submitted on : Thursday, July 3, 2014 - 3:35:21 PM
Last modification on : Thursday, July 5, 2018 - 2:46:11 PM
Long-term archiving on : Friday, October 3, 2014 - 11:35:53 AM

File

1403.6414.pdf
Publisher files allowed on an open archive

Identifiers

Citation

Karel Břinda. Languages of lossless seeds. Proceedings of the 14th International Conference on Automata and Formal Languages (AFL), May 2014, Szeged, Hungary. pp.139-150, ⟨10.4204/eptcs.151.9⟩. ⟨hal-01018062⟩

Share

Metrics

Record views

317

Files downloads

152