Circular Pattern Matching with $k$ Mismatches

13 Jan 2020  ·  Charalampopoulos Panagiotis, Kociumaka Tomasz, Pissis Solon P., Radoszewski Jakub, Rytter Wojciech, Straszyński Juliusz, Waleń Tomasz, Zuba Wiktor ·

The $k$-mismatch problem consists in computing the Hamming distance between a pattern $P$ of length $m$ and every length-$m$ substring of a text $T$ of length $n$, if this distance is no more than $k$. In many real-world applications, any cyclic rotation of $P$ is a relevant pattern, and thus one is interested in computing the minimal distance of every length-$m$ substring of $T$ and any cyclic rotation of $P$. This is the circular pattern matching with $k$ mismatches ($k$-CPM) problem. A multitude of papers have been devoted to solving this problem but, to the best of our knowledge, only average-case upper bounds are known. In this paper, we present the first non-trivial worst-case upper bounds for the $k$-CPM problem. Specifically, we show an $O(nk)$-time algorithm and an $O(n+\frac{n}{m}\,k^4)$-time algorithm. The latter algorithm applies in an extended way a technique that was very recently developed for the $k$-mismatch problem [Bringmann et al., SODA 2019]. A preliminary version of this work appeared at FCT 2019. In this version we improve the time complexity of the main algorithm from $O(n+\frac{n}{m}\,k^5)$ to $O(n+\frac{n}{m}\,k^4)$.

PDF Abstract
No code implementations yet. Submit your code now

Categories


Data Structures and Algorithms

Datasets


  Add Datasets introduced or used in this paper