Channel Recurrent Attention Networks For Video Pedestrian Retrieval
2020 · Pengfei Fang, Pan Ji, Jieming Zhou, et al.
Abstract
Full attention, which generates an attention value per element of the input feature maps, has been shown to benefit visual tasks. In this work, we propose a fully attentional network, termed the channel recurrent attention network, for the task of video pedestrian retrieval. The main attention unit, channel recurrent attention, identifies attention maps at the frame level by jointly leveraging spatial and channel patterns via a recurrent neural network. This channel recurrent attention is designed to build a global receptive field by recurrently receiving and learning the spatial vectors. Then, a set aggregation cell is employed to generate a compact video representation. Experimental results demonstrate the superior performance of the proposed deep network, outperforming current state-of-the-art results across standard video person retrieval benchmarks, and a thorough ablation study shows the effectiveness of the propose
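The data flow described in the abstract can be illustrated with a rough NumPy sketch. It is not the authors' implementation. Everything beyond the abstract is an assumption made for illustration: each channel of a frame's feature map is flattened into one spatial vector, a plain tanh recurrent cell reads those vectors channel by channel, a sigmoid turns each hidden state into one attention value per element, and the set aggregation cell is replaced by simple spatial and temporal averaging. The function names, weight shapes, and random weights are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_recurrent_attention(feat, w_in, w_rec, w_out):
    """Apply element-wise attention to one frame's feature map.

    feat: array of shape (C, H, W). Returns an array of the same shape.
    The recurrent cell here is an assumed plain tanh RNN.
    """
    c, h, w = feat.shape
    seq = feat.reshape(c, h * w)            # one spatial vector per channel
    hidden = np.zeros(w_rec.shape[0])
    attn = np.empty_like(seq)
    for i in range(c):                      # recur over the channel axis
        hidden = np.tanh(w_in @ seq[i] + w_rec @ hidden)
        attn[i] = sigmoid(w_out @ hidden)   # one attention value per element
    return feat * attn.reshape(c, h, w)


def set_aggregation(frame_feats):
    """Stand-in aggregation (assumed): average over space, then over frames.

    frame_feats: (T, C, H, W) -> compact video vector of shape (C,).
    """
    return frame_feats.mean(axis=(2, 3)).mean(axis=0)


# Hypothetical sizes and random weights, for shape checking only.
rng = np.random.default_rng(0)
T, C, H, W, D = 4, 8, 6, 3, 16
w_in = rng.normal(scale=0.1, size=(D, H * W))
w_rec = rng.normal(scale=0.1, size=(D, D))
w_out = rng.normal(scale=0.1, size=(H * W, D))

frames = rng.normal(size=(T, C, H, W))
attended = np.stack(
    [channel_recurrent_attention(f, w_in, w_rec, w_out) for f in frames]
)
video_repr = set_aggregation(attended)
```

Because the hidden state carries information from every channel already seen, each attention value can depend on the full spatial extent of the earlier channels, which is one way to read the "global receptive field" mentioned above.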
Related papers
- Pose-aided Video-based Person Re-identification Via Recurrent Graph Convolutional Network (2022) 10.97
- Multi-direction And Multi-scale Pyramid In Transformer For Video-based Pedestrian Retrieval (2022) 14.73
- VRAG: Region Attention Graphs For Content-based Video Retrieval (2022) 0.00
- All The Attention You Need: Global-local, Spatial-channel Attention For Image Retrieval (2021) 13.97
- Clothing Retrieval With Visual Attention Model (2017) 12.10
- Query-centric Audio-visual Cognition Network For Moment Retrieval, Segmentation And Step-captioning (2024) 3.58
- A Generic Visualization Approach For Convolutional Neural Networks (2020) 6.34
- HVD: Human Vision-driven Video Representation Learning For Text-video Retrieval (2026) 0.00