Multi-focused Video Group Activities Hashing
2025 · Zhongmiao Qi, Yan Jiang, Bolin Zhang, et al.
Abstract
With the explosive growth of video data in complex scenarios, quickly retrieving group activities has become an urgent problem. However, many existing methods retrieve videos only at the level of the entire video, not at activity granularity. To solve this problem, we propose, for the first time, a technique called STVH (spatiotemporal interleaved video hashing). Within a unified framework, STVH simultaneously models individual object dynamics and group interactions, capturing the spatiotemporal evolution of both group visual features and positional features. Moreover, real-life video retrieval sometimes requires activity features and at other times requires the visual features of objects. We therefore further propose M-STVH (multi-focused spatiotemporal video hashing), an enhanced version that handles this more difficult task. M-STVH incorporates hierarchical feature integration through multi-focused representation learning, allowing the model to joint
Related papers
- Encode The Unseen: Predictive Video Hashing For Scalable Mid-stream Retrieval (2020) 3.58
- CHAIN: Exploring Global-local Spatio-temporal Information For Improved Self-supervised Video Hashing (2023) 8.60
- Deep Heterogeneous Hashing For Face Video Retrieval (2019) 9.92
- Self-supervised Video Hashing With Hierarchical Binary Auto-encoder (2018) 17.81
- Delving Deeper: Hierarchical Visual Perception For Robust Video-text Retrieval (2026) 1.24
- Dual-stream Knowledge-preserving Hashing For Unsupervised Video Retrieval (2023) 9.23
- Query By Activity Video In The Wild (2023) 0.00
- HVD: Human Vision-driven Video Representation Learning For Text-video Retrieval (2026) 0.00