Near-Optimality for Single-Source Personalized PageRank

Abstract

The \emph{Single-Source Personalized PageRank} (SSPPR) query is central to graph OLAP, measuring the probability $\pi(s,t)$ that an $\alpha$-decay random walk from node $s$ terminates at $t$. Despite decades of research, a significant gap remains between upper and lower bounds for its computational complexity. Existing upper bounds are $O\left(\min\left(\frac{\log(1/\epsilon)}{\epsilon^2}, \frac{\sqrt{m \log n}}{\epsilon}, m \log \frac{1}{\epsilon}\right)\right)$ for SSPPR-A and $O\left(\min\left(\frac{\log(1/n)}{\delta}, \sqrt{m \log(n/\delta)}, m \log \left(\frac{\log(n)}{m\delta}\right)\right)\right)$ for SSPPR-R, with trivial lower bounds of $\Omega(\min(n,1/\epsilon))$ and $\Omega(\min(n,1/\delta))$. This work narrows or closes this gap. We improve the upper bounds for SSPPR-A and SSPPR-R to $O\left(\frac{1}{\epsilon^2}\right)$ and $O\left(\min\left(\frac{\log(1/\delta)}{\delta}, m + n \log(n) \log \left(\frac{\log(n)}{m\delta}\right)\right)\right)$, respectively, offering improvements by factors of $\log(1/\epsilon)$ and $\log\left(\frac{\log(n)}{m\delta}\right)$. On the lower bound side, we establish stronger results: $\Omega(\min(m, 1/\epsilon^2))$ for SSPPR-A and $\Omega(\min(m, \frac{\log(1/\delta)}{\delta}))$ for SSPPR-R, strengthening theoretical foundations. Our upper and lower bounds for SSPPR-R coincide for graphs with $m \in \Omega(n \log^2 n)$ and any threshold $\delta, 1/\delta \in O(\text{poly}(n))$, achieving theoretical optimality in most graph regimes. The SSPPR-A query attains partial optimality for large error thresholds, matching our new lower bound. This is the first optimal result for SSPPR queries. Our techniques generalize to the Single-Target Personalized PageRank (STPPR) query, improving its lower bound from $\Omega(\min(n, 1/\delta))$ to $\Omega(\min(m, \frac{n}{\delta} \log n))$, matching the upper bound and revealing its optimality.

Abstract

Related papers