Computer Science > Information Theory
Title: Exact Reconstruction from Insertions in Synchronization Codes
(Submitted on 11 Apr 2016 (this version), latest version 7 Mar 2017 (v2))
Abstract: This work studies problems in data reconstruction, an important area with numerous applications. In particular, we examine the reconstruction of binary and nonbinary sequences from synchronization (insertion/deletion-correcting) codes. These sequences have been corrupted by a fixed number of symbol insertions (larger than the minimum edit distance of the code), yielding a number of distinct traces to be used for reconstruction. We wish to know the minimum number of traces needed for exact reconstruction.
This is a general version of a problem tackled by Levenshtein for uncoded sequences. We introduce an exact formula for the maximum number of common supersequences shared by sequences at a certain edit distance, yielding an upper bound on the number of distinct traces necessary to guarantee exact reconstruction. Without specific knowledge of the codewords, this upper bound is tight. We apply our results to the famous single deletion/insertion-correcting Varshamov-Tenengolts (VT) codes and show that a significant number of VT codeword pairs achieve the worst-case number of outputs needed for exact reconstruction. This result opens up a novel area for study: the development of codes with comparable rate and minimum distance properties that require fewer traces for reconstruction.
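The quantities discussed in the abstract can be explored by brute force for very short block lengths. The sketch below is illustrative only and is not the paper's method or formula: it enumerates the set of sequences reachable by exactly t insertions, builds the binary VT code using the standard checksum definition (sum of i * x_i congruent to a modulo n+1), and reports the largest number of common supersequences shared by any two codewords. In Levenshtein's framework, one more distinct trace than this maximum is enough to tell any two codewords apart. The function names and parameters are hypothetical, chosen for this example.

```python
from itertools import combinations


def insertion_ball(x, t, alphabet="01"):
    """Return the set of distinct sequences obtained from x by exactly t insertions."""
    ball = {x}
    for _ in range(t):
        # Insert every symbol at every position of every sequence found so far.
        ball = {
            s[:i] + a + s[i:]
            for s in ball
            for i in range(len(s) + 1)
            for a in alphabet
        }
    return ball


def vt_code(n, a=0):
    """Binary VT code: length-n words x with sum(i * x_i) congruent to a mod (n + 1)."""
    words = []
    for v in range(2 ** n):
        x = format(v, f"0{n}b")
        checksum = sum(i * int(bit) for i, bit in enumerate(x, start=1))
        if checksum % (n + 1) == a:
            words.append(x)
    return words


def max_common_supersequences(code, t):
    """Largest intersection of t-insertion balls over all pairs of distinct words."""
    balls = {c: insertion_ball(c, t) for c in code}
    return max(len(balls[u] & balls[v]) for u, v in combinations(code, 2))


if __name__ == "__main__":
    code = vt_code(4)
    for t in (1, 2, 3):
        worst = max_common_supersequences(code, t)
        print(f"t={t}: max common supersequences = {worst}, "
              f"traces sufficient = {worst + 1}")
```

With t = 1 the VT insertion balls are disjoint, as expected for a single-insertion-correcting code; the interesting regime from the abstract begins at t = 2, where the number of insertions exceeds what the code can correct. The enumeration is exponential in n and t, so it is only practical for toy parameters.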
Submission history
From: Frederic Sala
[v1] Mon, 11 Apr 2016 15:37:25 GMT (237kb,D)
[v2] Tue, 7 Mar 2017 23:23:20 GMT (499kb,D)