Guess & Check Codes for Deletions, Insertions, and Synchronization

24 May 2017  ·  Serge Kas Hanna, Salim El Rouayheb ·

We consider the problem of constructing codes that can correct $\delta$ deletions occurring in an arbitrary binary string of length $n$ bits. Varshamov-Tenengolts (VT) codes, dating back to 1965, are zero-error single deletion $(\delta=1)$ correcting codes, and have an asymptotically optimal redundancy. Finding similar codes for $\delta \geq 2$ deletions remains an open problem. In this work, we relax the standard zero-error (i.e., worst-case) decoding requirement by assuming that the positions of the $\delta$ deletions (or insertions) are independent of the codeword. Our contribution is a new family of explicit codes, that we call Guess & Check (GC) codes, that can correct with high probability up to a constant number of $\delta$ deletions (or insertions). GC codes are systematic; and have deterministic polynomial time encoding and decoding algorithms. We also describe the application of GC codes to file synchronization.

PDF Abstract

Categories


Information Theory Information Theory

Datasets


  Add Datasets introduced or used in this paper