Skip to Main content Skip to Navigation
Reports

A Basis for Repeated Motifs in Pattern Discovery and Text Mining

Abstract : We present a new notion of basis that is able to generate the repeated motifs (possibly exponential in number) that appear at least twice with don't care symbols in a string of length n over an alphabet . Our basis has some interesting features such as being (a) a subset of the previously de ned bases, (b) truly linear as its motifs are less than n in number and appear in the string for a total of 2n times at most; (c) symmetric as the basis of the reversed string is the reverse of the basis; (d) computable in polynomial time, namely, in O(n log n log jj) time.
Complete list of metadatas

https://hal-upec-upem.archives-ouvertes.fr/hal-00627828
Contributor : Maxime Crochemore <>
Submitted on : Thursday, September 29, 2011 - 4:25:02 PM
Last modification on : Wednesday, February 26, 2020 - 7:06:06 PM
Long-term archiving on: : Friday, December 30, 2011 - 2:30:31 AM

File

Hal.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00627828, version 1

Collections

Citation

Nadia Pisanti, Maxime Crochemore, Roberto Grossi, Marie-France Sagot. A Basis for Repeated Motifs in Pattern Discovery and Text Mining. 2002. ⟨hal-00627828⟩

Share

Metrics

Record views

326

Files downloads

119