Scalable Graph Attention-based Instance Selection Via Mini-batch Sampling And Hierarchical Hashing
2025 · Zahiriddin Rustamov, Ayham Zaitouny, Nazar Zaki
Abstract
Instance selection (IS) addresses the critical challenge of reducing dataset size while keeping informative characteristics, and it becomes increasingly important as datasets grow to millions of instances. Current IS methods often struggle to capture complex relationships in high-dimensional spaces and to scale to large datasets. This paper introduces a graph attention-based instance selection (GAIS) method that uses attention mechanisms to identify informative instances through their structural relationships in graph representations. We present two approaches for scalable graph construction: a distance-based mini-batch sampling technique that achieves dataset-size-independent complexity through strategic batch processing, and a hierarchical hashing approach that enables efficient similarity computation through random projections. The mini-batch approach keeps class distributions through stratified sampling, while the hierarchical hashing method captures relationships at multiple granular
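To make the random-projection idea in the abstract concrete, here is a minimal sketch of how hashing at several resolutions could propose candidate graph edges without comparing every pair of instances. It is not the authors' implementation: the function names `hash_code` and `hierarchical_candidates`, the `bit_levels` parameter, and the choice of sign-based random projections with one hash table per level are all assumptions made for illustration.

```python
import random
from itertools import combinations


def hash_code(x, planes):
    """Sign-random-projection code: one boolean per hyperplane (hypothetical helper)."""
    return tuple(sum(xi * pi for xi, pi in zip(x, plane)) > 0 for plane in planes)


def hierarchical_candidates(X, bit_levels=(2, 4, 8), seed=0):
    """Collect candidate edges (i, j), i < j, for points that share a bucket
    at any level. Fewer bits give coarse buckets, more bits give fine ones.
    The level structure is an assumption, not taken from the paper."""
    rng = random.Random(seed)
    dim = len(X[0])
    edges = set()
    for n_bits in bit_levels:
        # Draw fresh Gaussian hyperplanes for this level.
        planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        buckets = {}
        for i, x in enumerate(X):
            buckets.setdefault(hash_code(x, planes), []).append(i)
        # Only points in the same bucket become candidate neighbours.
        for members in buckets.values():
            edges.update(combinations(members, 2))
    return edges


X = [[1.0, 2.0], [1.0, 2.0], [-1.0, -2.0]]
edges = hierarchical_candidates(X)
```

In this toy input the two identical points always fall into the same bucket, while the third point, being their exact negation, lands on the opposite side of every hyperplane. Pairing all members of a bucket is quadratic in bucket size, so a practical version would likely cap bucket sizes or subsample within buckets.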
Related papers
- An Instance Selection Algorithm For Big Data In High Imbalanced Datasets Based On LSH (2022) 0.00
- A Graphical Heuristic For Reduction And Partitioning Of Large Datasets For Scalable Supervised Training (2019) 4.52
- Graph Sampling Based Deep Metric Learning For Generalizable Person Re-identification (2021) 20.14
- Instance-based Learning Using The Half-space Proximal Graph (2021) 6.77
- Cascading Hierarchical Networks With Multi-task Balanced Loss For Fine-grained Hashing (2023) 0.00
- Sparse-inductive Generative Adversarial Hashing For Nearest Neighbor Search (2023) 0.00
- Graph-collaborated Auto-encoder Hashing For Multi-view Binary Clustering (2023) 14.31
- Asymmetric Transfer Hashing With Adaptive Bipartite Graph Learning (2022) 8.82