Title: Binarized Johnson-Lindenstrauss embeddings

Abstract: We consider the problem of encoding a set of vectors into a minimal number of bits while preserving information on their Euclidean geometry. We show that this task can be accomplished by applying a Johnson-Lindenstrauss embedding and subsequently binarizing each vector by comparing each entry of the vector to a uniformly random threshold. Using this simple construction we produce two encodings of a dataset such that one can query Euclidean information for a pair of points using a small number of bit operations up to a desired additive error - Euclidean distances in the first case and inner products and squared Euclidean distances in the second. In the latter case, each point is encoded in near-linear time. The number of bits required for these encodings is quantified in terms of two natural complexity parameters of the dataset - its covering numbers and localized Gaussian complexity - and shown to be near-optimal.
Subjects: Information Theory (cs.IT); Data Structures and Algorithms (cs.DS); Metric Geometry (math.MG)
Cite as: arXiv:2009.08320 [cs.IT]
  (or arXiv:2009.08320v1 [cs.IT] for this version)
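
The construction described in the abstract (a random linear embedding followed by comparing each entry to a uniformly random threshold) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the dense Gaussian matrix, the threshold range `lam`, the names `make_encoder` and `estimate_distance`, and the rescaling constant in the estimator. The estimator relies on a standard fact about uniform dithering: if a threshold is uniform on [-lam, lam] and two projected values both lie in that interval, the probability that their bits differ is proportional to the gap between them.

```python
import math
import random


def make_encoder(n, m, lam, seed=0):
    """Return a function mapping a length-n vector to m bits.

    Assumptions (illustrative, not from the paper): a dense m x n matrix
    with i.i.d. standard Gaussian entries, and one threshold per output
    coordinate drawn uniformly from [-lam, lam].
    """
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    tau = [rng.uniform(-lam, lam) for _ in range(m)]

    def encode(x):
        bits = []
        for row, t in zip(A, tau):
            proj = sum(a * xi for a, xi in zip(row, x))
            # Binarize: compare the projected entry to its random threshold.
            bits.append(1 if proj + t >= 0 else 0)
        return bits

    return encode


def estimate_distance(bx, by, lam):
    """Estimate the Euclidean distance from two bit strings.

    If both projections lie in [-lam, lam], a bit differs with probability
    |<g, x - y>| / (2 * lam), and E|<g, x - y>| = sqrt(2 / pi) * ||x - y||
    for a standard Gaussian g. Rescaling the Hamming fraction by
    lam * sqrt(2 * pi) therefore gives an approximately unbiased estimate.
    """
    m = len(bx)
    hamming = sum(b1 != b2 for b1, b2 in zip(bx, by))
    return lam * math.sqrt(2.0 * math.pi) * hamming / m


if __name__ == "__main__":
    encode = make_encoder(n=8, m=4096, lam=8.0, seed=1)
    x = [1.0] + [0.0] * 7
    y = [0.0, 1.0] + [0.0] * 6
    print("true distance:", math.sqrt(2.0))
    print("estimate from bits:", estimate_distance(encode(x), encode(y), 8.0))
```

In this sketch `lam` should be several times larger than the norms of the encoded points, so that projections rarely fall outside the threshold range. A larger `lam` also increases the variance of the estimate for a fixed number of bits `m`, which is one way to see the additive-error trade-off the abstract refers to.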

Submission history

From: Sjoerd Dirksen
[v1] Thu, 17 Sep 2020 14:12:40 GMT (27kb)
