References & Citations
Computer Science > Data Structures and Algorithms
Title: FAST-PPR: Scaling Personalized PageRank Estimation for Large Graphs
(Submitted on 11 Apr 2014 (v1), last revised 22 Aug 2014 (this version, v2))
Abstract: We propose a new algorithm, FAST-PPR, for estimating personalized PageRank: given start node $s$ and target node $t$ in a directed graph, and given a threshold $\delta$, FAST-PPR estimates the Personalized PageRank $\pi_s(t)$ from $s$ to $t$, guaranteeing a small relative error as long $\pi_s(t)>\delta$. Existing algorithms for this problem have a running-time of $\Omega(1/\delta)$; in comparison, FAST-PPR has a provable average running-time guarantee of ${O}(\sqrt{d/\delta})$ (where $d$ is the average in-degree of the graph). This is a significant improvement, since $\delta$ is often $O(1/n)$ (where $n$ is the number of nodes) for applications. We also complement the algorithm with an $\Omega(1/\sqrt{\delta})$ lower bound for PageRank estimation, showing that the dependence on $\delta$ cannot be improved.
We perform a detailed empirical study on numerous massive graphs, showing that FAST-PPR dramatically outperforms existing algorithms. For example, on the 2010 Twitter graph with 1.5 billion edges, for target nodes sampled by popularity, FAST-PPR has a $20$ factor speedup over the state of the art. Furthermore, an enhanced version of FAST-PPR has a $160$ factor speedup on the Twitter graph, and is at least $20$ times faster on all our candidate graphs.
Submission history
From: Peter Lofgren [view email][v1] Fri, 11 Apr 2014 18:41:04 GMT (762kb,D)
[v2] Fri, 22 Aug 2014 03:36:09 GMT (1433kb,D)
Link back to: arXiv, form interface, contact.