We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.DS

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Data Structures and Algorithms

Title: Ribbon filter: practically smaller than Bloom and Xor

Abstract: Filter data structures over-approximate a set of hashable keys, i.e. set membership queries may incorrectly come out positive. A filter with false positive rate $f \in (0,1]$ is known to require $\ge \log_2(1/f)$ bits per key. At least for larger $f \ge 2^{-4}$, existing practical filters require a space overhead of at least 20% with respect to this information-theoretic bound.
We introduce the Ribbon filter: a new filter for static sets with a broad range of configurable space overheads and false positive rates with competitive speed over that range, especially for larger $f \ge 2^{-7}$. In many cases, Ribbon is faster than existing filters for the same space overhead, or can achieve space overhead below 10% with some additional CPU time. An experimental Ribbon design with load balancing can even achieve space overheads below 1%.
A Ribbon filter resembles an Xor filter modified to maximize locality and is constructed by solving a band-like linear system over Boolean variables. In previous work, Dietzfelbinger and Walzer describe this linear system and an efficient Gaussian solver. We present and analyze a faster, more adaptable solving process we call "Rapid Incremental Boolean Banding ON the fly," which resembles hash table construction. We also present and analyze an attractive Ribbon variant based on making the linear system homogeneous, and describe several more practical enhancements.
Comments: 14 pages, 7 figures
Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB)
ACM classes: E.1; E.2
Cite as: arXiv:2103.02515 [cs.DS]
  (or arXiv:2103.02515v2 [cs.DS] for this version)

Submission history

From: Peter C Dillinger [view email]
[v1] Wed, 3 Mar 2021 16:41:36 GMT (435kb,D)
[v2] Mon, 8 Mar 2021 16:14:45 GMT (435kb,D)

Link back to: arXiv, form interface, contact.