Computer Science > Data Structures and Algorithms
Title: Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs
(Submitted on 22 Feb 2021)
Abstract: For a directed graph $G$ with $n$ vertices and a start vertex $u_{\sf start}$, we wish to (approximately) sample an $L$-step random walk over $G$ starting from $u_{\sf start}$ using as little space as possible, with an algorithm that makes only a few passes over the edges of the graph. This problem has found many applications, for instance, in approximating the PageRank of a webpage. If only a single pass is allowed, the space complexity of this problem was shown to be $\tilde{\Theta}(n \cdot L)$. Prior to our work, a better space complexity was only known with $\tilde{O}(\sqrt{L})$ passes.
We settle the space complexity of this random walk simulation problem for two-pass streaming algorithms, showing that it is $\tilde{\Theta}(n \cdot \sqrt{L})$, by giving almost matching upper and lower bounds. Our lower bound argument extends to every constant number of passes $p$, and shows that any $p$-pass algorithm for this problem uses $\tilde{\Omega}(n \cdot L^{1/p})$ space. In addition, we show a similar $\tilde{\Theta}(n \cdot \sqrt{L})$ bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an $L$-step random walk from every vertex in the graph.
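To make the problem setting concrete, here is a minimal Python sketch of a folklore single-pass baseline, not the two-pass algorithm of the paper. It assumes the stream is an iterable of directed edges $(u, v)$; the function name `one_pass_walk` and its parameters are hypothetical. For every vertex it keeps $L$ independent reservoir samples of an out-neighbour, so it stores $O(n \cdot L)$ words, consistent with the single-pass bound quoted above. Since an $L$-step walk leaves any vertex at most $L$ times, each departure can consume a fresh sample.

```python
import random
from collections import defaultdict


def one_pass_walk(edge_stream, start, L, rng=None):
    """Sample an L-step random walk from `start` with one pass over the edges.

    Illustrative baseline only: memory is O(n * L) words, because each
    vertex keeps L independent uniform samples of its out-neighbours.
    """
    rng = rng or random.Random()
    # samples[u][i] is the i-th independently sampled out-neighbour of u.
    samples = defaultdict(lambda: [None] * L)
    out_deg = defaultdict(int)

    # The single pass: L independent size-1 reservoirs per source vertex.
    for u, v in edge_stream:
        out_deg[u] += 1
        slots = samples[u]
        for i in range(L):
            # Replace slot i with probability 1 / (out-edges of u seen so far).
            if rng.randrange(out_deg[u]) == 0:
                slots[i] = v

    # Post-processing: build the walk from stored samples, no stream access.
    walk = [start]
    used = defaultdict(int)  # how many samples of each vertex were consumed
    cur = start
    for _ in range(L):
        if out_deg.get(cur, 0) == 0:
            break  # dead end: the walk cannot continue
        k = used[cur]
        used[cur] += 1
        cur = samples[cur][k]  # fresh, previously unused sample
        walk.append(cur)
    return walk
```

On a directed 3-cycle every vertex has a single out-edge, so `one_pass_walk([(0, 1), (1, 2), (2, 0)], 0, 5)` returns `[0, 1, 2, 0, 1, 2]`. Stopping at a vertex with no out-edges is a choice made for this sketch, not something the abstract specifies.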