Hybrid, Unified And Iterative: A Novel Framework For Text-based Person Anomaly Retrieval
2025 · Tien-Huy Nguyen, Huu-Loc Tran, Huu-Phong Phan-Nguyen, et al.
Abstract
Text-based person anomaly retrieval has emerged as a challenging task, with most existing approaches relying on complex deep-learning techniques. This raises a research question: how can a model be optimized to capture finer-grained features? To address this, we propose a Local-Global Hybrid Perspective (LHP) module integrated with a Vision-Language Model (VLM), designed to explore the effectiveness of incorporating fine-grained features alongside coarse-grained features. Additionally, we investigate a Unified Image-Text (UIT) model that combines multiple objective loss functions: Image-Text Contrastive (ITC), Image-Text Matching (ITM), Masked Language Modeling (MLM), and Masked Image Modeling (MIM) losses. Beyond this, we propose a novel iterative ensemble strategy that combines model results iteratively rather than simultaneously, as other ensemble methods do. To take advantage of the superior performance of the LHP model, we introduce a novel feature selecti
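As a rough illustration of the ITC objective named in the abstract, the sketch below implements the commonly used symmetric contrastive loss over a batch of matched image/text embeddings. The excerpt does not give the paper's exact formulation, so the function names (`itc_loss`, `unified_loss`), the temperature value, and the equal weighting of the four objectives are all assumptions made for illustration, not the authors' implementation.

```python
import math


def _log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def _normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def itc_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss for a batch of matched image/text pairs.

    Pair i is treated as the positive; every other pairing in the batch
    acts as a negative. The temperature of 0.07 is an assumed default.
    """
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    n = len(imgs)
    # Cosine-similarity logits, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
        for i in imgs
    ]
    # Image-to-text direction: softmax over each row.
    i2t = -sum(_log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: softmax over each column.
    t2i = -sum(
        _log_softmax([logits[r][k] for r in range(n)])[k] for k in range(n)
    ) / n
    return 0.5 * (i2t + t2i)


def unified_loss(itc, itm, mlm, mim, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four objectives; equal weights are an assumption."""
    return sum(w * value for w, value in zip(weights, (itc, itm, mlm, mim)))
```

With correctly aligned pairs the loss approaches zero, while swapping the captions between images drives it up, which is the behaviour a contrastive objective is meant to reward.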
Related papers
- Multi-path Exploration And Feedback Adjustment For Text-to-image Person Retrieval (2024) 0.00
- See Finer, See More: Implicit Modality Alignment For Text-based Person Retrieval (2022) 18.39
- Decoupled Cross-modal Alignment Network For Text-RGBT Person Retrieval And A High-quality Benchmark (2025) 0.00
- Boosting Weak Positives For Text Based Person Search (2025) 0.00
- BEAT: Bi-directional One-to-many Embedding Alignment For Text-based Person Retrieval (2024) 10.85
- Look Before You Leap: Improving Text-based Person Retrieval By Learning A Consistent Cross-modal Common Manifold (2022) 15.34
- Cross-modal Implicit Relation Reasoning And Aligning For Text-to-image Person Retrieval (2023) 18.15
- Dynamic Uncertainty Learning With Noisy Correspondence For Text-based Person Search (2025) 7.50