Computer Science > Data Structures and Algorithms
Title: Enumerative Data Compression with Non-Uniquely Decodable Codes
(Submitted on 13 Nov 2019)
Abstract: Non-uniquely decodable codes can be defined as the codes that cannot be uniquely decoded without additional disambiguation information. These are mainly the class of non-prefix-free codes, where a codeword can be a prefix of one or more other codewords, and thus the codeword boundary information is essential for correct decoding. Although the codeword bit stream consumes significantly less space than that of prefix-free codes, the additional disambiguation information makes it difficult to match the performance of prefix-free codes in total. Previous studies considered compression with non-prefix-free codes by integrating rank/select dictionaries or wavelet trees to mark the codeword boundaries. In this study we focus on another dimension with a block-wise enumeration scheme that improves the compression ratios of the previous studies significantly. Experiments conducted on a known corpus showed that the proposed scheme successfully represents a source within its entropy, even performing better than Huffman and arithmetic coding in some cases. Non-uniquely decodable codes also provide an intrinsic security feature due to the lack of unique decodability. We investigate this dimension as an opportunity to provide compressed data security without (or with less) encryption, and discuss various possible practical advantages supported by such codes.
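The ambiguity the abstract describes can be illustrated with a small sketch. The three-symbol code table below is a hypothetical toy example and is not taken from the paper, and the list of codeword lengths merely stands in for the "disambiguation information"; it is not the block-wise enumeration scheme the paper proposes. The sketch shows that a bit stream produced by a non-prefix-free code admits several parses, and that the boundary information selects the intended one.

```python
from functools import lru_cache

# Hypothetical toy code table (an assumption for illustration only):
# "0" is a prefix of "01", so this code is not prefix-free.
CODE = {"a": "0", "b": "1", "c": "01"}


def encode(text):
    """Concatenate codewords; also return each codeword's length,
    which serves here as the codeword boundary information."""
    bits = "".join(CODE[ch] for ch in text)
    lengths = [len(CODE[ch]) for ch in text]
    return bits, lengths


def all_decodings(bits):
    """Return, in sorted order, every symbol string whose encoding is `bits`."""

    @lru_cache(maxsize=None)
    def parse(pos):
        # Reaching the end of the stream yields one complete (empty) parse.
        if pos == len(bits):
            return ("",)
        results = []
        for symbol, word in CODE.items():
            # Try every codeword that matches at the current position.
            if bits.startswith(word, pos):
                results.extend(symbol + rest for rest in parse(pos + len(word)))
        return tuple(results)

    return sorted(parse(0))


def decode_with_lengths(bits, lengths):
    """Decode unambiguously by cutting the stream at the given boundaries."""
    inverse = {word: symbol for symbol, word in CODE.items()}
    out, pos = [], 0
    for n in lengths:
        out.append(inverse[bits[pos:pos + n]])
        pos += n
    return "".join(out)


if __name__ == "__main__":
    bits, lengths = encode("cab")
    print(bits, lengths)
    print(all_decodings(bits))
    print(decode_with_lengths(bits, lengths))
```

With this toy table, the stream for "cab" has more than one valid parse, which is why a decoder without the boundary information cannot recover the source, and why the cost of storing that information matters for the overall compression ratio.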
Submission history
From: M. Oğuzhan Külekci
[v1] Wed, 13 Nov 2019 17:55:06 GMT (91kb,D)