Computer Science > Information Theory
Title: Efficient Compression of Long Arbitrary Sequences with No Reference at the Encoder
(Submitted on 23 Feb 2020 (v1), last revised 6 Sep 2020 (this version, v2))
Abstract: In a distributed information application an encoder compresses an arbitrary vector while a similar reference vector is available to the decoder as side information. For the Hamming-distance similarity measure, and when guaranteed perfect reconstruction is required, we present two contributions to the solution of this problem. One result shows that when a set of potential reference vectors is available to the encoder, lower compression rates can be achieved when the set satisfies a certain clustering property. Another result reduces the best known decoding complexity from exponential in the vector length $n$ to $O(n^{1.5})$ by generalized concatenation of inner coset codes and outer error-correcting codes. One potential application of the results is the compression of DNA sequences, where similar (but not identical) reference vectors are shared among senders and receivers.
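The setting in the abstract (an encoder that never sees the reference, a Hamming-distance guarantee, and perfect reconstruction) can be illustrated with a toy coset code. The sketch below is not the paper's construction: it assumes a [7,4] Hamming code as the coset code and a distance bound of 1, and the names `syndrome`, `encode`, and `decode` are hypothetical. The encoder sends only the 3-bit syndrome of its 7-bit vector; the decoder combines that syndrome with its reference vector to locate the single position where the two may differ.

```python
# Toy illustration (an assumption, not the paper's scheme): coset coding
# with the [7,4] Hamming code, for vectors within Hamming distance 1.

# Parity-check matrix: column j (1-indexed) is the binary expansion of j,
# with bit b of j stored in row b.
H = [[(j >> b) & 1 for j in range(1, 8)] for b in range(3)]


def syndrome(v):
    """Return the 3-bit syndrome H*v over GF(2) as a tuple."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)


def encode(x):
    """Encoder: sees only x and sends its coset index (3 bits for 7)."""
    return syndrome(x)


def decode(s, y):
    """Decoder: recover x from its syndrome s and a reference y,
    assuming x and y differ in at most one position."""
    # syndrome(x XOR y) = s XOR syndrome(y) identifies the differing position.
    diff = [a ^ b for a, b in zip(s, syndrome(y))]
    pos = diff[0] + 2 * diff[1] + 4 * diff[2]  # 0 means x == y
    x = list(y)
    if pos:
        x[pos - 1] ^= 1  # flip the single differing bit (1-indexed position)
    return x


x = [1, 0, 1, 1, 0, 0, 1]
y = [1, 0, 1, 0, 0, 0, 1]  # reference at the decoder, differs from x in one bit
assert decode(encode(x), y) == x
```

Under these assumptions, 7 bits are compressed to 3 with guaranteed recovery whenever the reference is within Hamming distance 1. Decoding here is a single syndrome lookup; the complexity question the abstract addresses arises for long vectors and larger distances, which this toy example does not cover.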
Submission history
From: Yuval Cassuto
[v1] Sun, 23 Feb 2020 12:43:07 GMT (86kb,D)
[v2] Sun, 6 Sep 2020 08:54:35 GMT (87kb,D)