Embassi: Embedding Assignment Costs For Similarity Search In Large Graph Databases
2021 Β· Franka Bause, Erich Schubert, Nils M. Kriege
Abstract
The graph edit distance is an intuitive measure to quantify the dissimilarity of graphs, but its computation is NP-hard and challenging in practice. We introduce methods for answering nearest neighbor and range queries regarding this distance efficiently for large databases with up to millions of graphs. We build on the filter-verification paradigm, where lower and upper bounds are used to reduce the number of exact computations of the graph edit distance. Highly effective bounds for this involve solving a linear assignment problem for each graph in the database, which is prohibitive in massive datasets. Index-based approaches typically provide only weak bounds leading to high computational costs verification. In this work, we derive novel lower bounds for efficient filtering from restricted assignment problems, where the cost function is a tree metric. This special case allows embedding the costs of optimal assignments isometrically into \(\ell_1\) space, rendering efficient indexing
Authors
(none)
Tags
Stats
Related papers
- Massively-parallel Similarity Join, Edge-isoperimetry, And Distance Correlations On The Hypercube (2016)2.26
- Accurate And Fast Retrieval For Complex Non-metric Data Via Neighborhood Graphs (2019)0.00
- Utilizing Low-dimensional Molecular Embeddings For Rapid Chemical Similarity Search (2024)4.52
- Simgnn: A Neural Network Approach To Fast Graph Similarity Computation (2018)0.00
- A Scalable Solution To The Nearest Neighbor Search Problem Through Local-search Methods On Neighbor Graphs (2017)3.58
- Billion-scale Similarity Search Using A Hybrid Indexing Approach With Advanced Filtering (2025)4.52
- Meta-path Guided Embedding For Similarity Search In Large-scale Heterogeneous Information Networks (2016)0.00
- Hd-index: Pushing The Scalability-accuracy Boundary For Approximate Knn Search In High-dimensional Spaces (2018)14.02