Title: Clustering is Efficient for Approximate Maximum Inner Product Search

Abstract: Efficient Maximum Inner Product Search (MIPS) is an important task with wide applicability in recommendation systems and in classification with a large number of classes. Solutions based on locality-sensitive hashing (LSH), as well as tree-based solutions, have been investigated in the recent literature to perform approximate MIPS in sublinear time. In this paper, we compare these to another extremely simple approach for solving approximate MIPS, based on variants of the k-means clustering algorithm. Specifically, we propose to train a spherical k-means after reducing the MIPS problem to a Maximum Cosine Similarity Search (MCSS). Experiments on two standard recommendation system benchmarks, as well as on large-vocabulary word embeddings, show that this simple approach yields much higher speedups, for the same retrieval precision, than current state-of-the-art hashing-based and tree-based methods. This simple method also yields more robust retrievals when the query is corrupted by noise.
Comments: 10 pages, Under review at ICLR 2016
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as: arXiv:1507.05910 [cs.LG]
  (or arXiv:1507.05910v3 [cs.LG] for this version)

Submission history

From: Sarath Chandar
[v1] Tue, 21 Jul 2015 16:53:12 GMT (79kb,D)
[v2] Fri, 20 Nov 2015 16:36:09 GMT (548kb,D)
[v3] Mon, 30 Nov 2015 02:26:44 GMT (548kb,D)
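
The approach outlined in the abstract (reduce MIPS to MCSS, then cluster with spherical k-means and search only the most promising clusters) can be illustrated with a minimal NumPy sketch. The abstract does not say which reduction is used, so this sketch assumes one common choice: appending an extra coordinate so that all database vectors share the same norm, which makes the inner-product ranking equal to the cosine ranking. All function names, the flat (non-hierarchical) clustering, and the `n_probe` parameter are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np


def reduce_mips_to_mcss(database):
    """Assumed reduction: append sqrt(M^2 - ||x||^2) to each row, where M is
    the largest row norm, so every augmented row has norm M."""
    norms = np.linalg.norm(database, axis=1)
    max_norm = norms.max()
    extra = np.sqrt(np.maximum(max_norm ** 2 - norms ** 2, 0.0))
    return np.hstack([database, extra[:, None]])


def spherical_kmeans(points, n_clusters, n_iters=20, seed=0):
    """Plain spherical k-means: assign by cosine similarity and keep
    centroids on the unit sphere. Assumes no all-zero rows."""
    rng = np.random.default_rng(seed)
    unit = points / np.linalg.norm(points, axis=1, keepdims=True)
    centroids = unit[rng.choice(len(unit), n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        assign = np.argmax(unit @ centroids.T, axis=1)
        for k in range(n_clusters):
            total = unit[assign == k].sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:  # empty clusters keep their previous centroid
                centroids[k] = total / norm
    assign = np.argmax(unit @ centroids.T, axis=1)
    return centroids, assign


def approximate_mips(query, database, centroids, assign, n_probe=2):
    """Score only the rows that belong to the n_probe clusters whose
    centroids are most similar to the zero-padded query."""
    q_aug = np.append(query, 0.0)  # the query gets a zero in the extra slot
    top_clusters = np.argsort(-(centroids @ q_aug))[:n_probe]
    candidates = np.flatnonzero(np.isin(assign, top_clusters))
    if candidates.size == 0:  # all probed clusters empty: fall back to a full scan
        candidates = np.arange(len(database))
    scores = database[candidates] @ query
    return int(candidates[np.argmax(scores)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    db = rng.normal(size=(200, 8))   # toy database, not a real benchmark
    q = rng.normal(size=8)
    cents, assign = spherical_kmeans(reduce_mips_to_mcss(db), n_clusters=10)
    print("approximate:", approximate_mips(q, db, cents, assign, n_probe=3))
    print("exact:      ", int(np.argmax(db @ q)))
```

In this sketch, `n_probe` trades speed for precision: probing fewer clusters scores fewer candidates but may miss the true maximizer, while probing all clusters reduces to an exact linear scan.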
