JnXbUKtLzz@OpenReview

Total: 1

#1 Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs [PDF3] [Copy] [Kimi] [REL]

Authors: Siddharth Gollapudi, Ravishankar Krishnaswamy, Kirankumar Shiragur, Harsh Wardhan

Graph-based data structures have become powerful and ubiquitous tools for scalable approximate nearest-neighbor (ANN) search over the past decade. In spite of their apparent practical performance, there has only recently been progress on the **worst-case** performance of these data structures. Indeed, the influential work of Indyx and Xu (2023) introduced the key concept of $\alpha$-reachable graphs, showing that graphs constructed by the DiskANN algorithm (Subramanya, et. al. 2023) produce an $\left(\frac{\alpha+1}{\alpha-1}\right)$-approximate solution with a simple best-first search that runs in poly-logarithmic query time. In our work, we improve and generalize this analysis as follows: - We introduce **sorted** $\alpha$-reachable graphs, and use this notion to obtain a stronger approximation factor of $\frac{\alpha}{\alpha-1}$ for the DiskANN algorithm on Euclidean metrics. - We present the **first** worst-case theoretical analysis for the popular **beam-search** algorithm, which is used in practice to search these graphs for $k > 1$ candidate nearest neighbors.We also present empirical results validating the significance of sorted $\alpha$-reachable graphs, which aligns with our theoretical findings.

Subject: ICML.2025 - Poster