Awesome Tracking
Tracking is one of the most active areas in Awesome Computer Vision, with 877 papers in this collection evaluated on datasets such as YouTube-VOS, MOT-17, and DAVIS 2017. A strong starting point is "Fairmot: On The Fairness Of Detection And Re-identification In Multiple Object Tracking".
Datasets & benchmarks
Key papers
- Fairmot: On The Fairness Of Detection And Re-identification In Multiple Object Tracking (2020); Yifu Zhang, Chunyu Wang, Xinggang Wang, et al.; 31.02
- Towards Real-time Multi-object Tracking (2019); Zhongdao Wang, Liang Zheng, Yixuan Liu, et al.; 28.64
- Higherhrnet: Scale-aware Representation Learning For Bottom-up Human Pose Estimation (2019); Bowen Cheng, Bin Xiao, Jingdong Wang, et al.; 27.97
- Rethinking The Competition Between Detection And Reid In Multi-object Tracking (2020); Chao Liang, Zhipeng Zhang, Xue Zhou, et al.; 24.06
- Discriminative Scale Space Tracking (2016); Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, et al.; 23.00
- Tubetk: Adopting Tubes To Track Multi-object In A One-step Training Model (2020); Bo Pang, Yizhuo Li, Yifan Zhang, et al.; 22.36
- How To Train Your Deep Multi-object Tracker (2019); Yihong Xu, Aljosa Osep, Yutong Ban, et al.; 22.35
- Deep Affinity Network For Multiple Object Tracking (2018); Shijie Sun, Naveed Akhtar, Huansheng Song, et al.; 22.06
- Tracking Without Bells And Whistles (2019); Philipp Bergmann, Tim Meinhardt, Laura Leal-Taixe; 22.04
- Hand Keypoint Detection In Single Images Using Multiview Bootstrapping (2017); Tomas Simon, Hanbyul Joo, Iain Matthews, et al.; 22.00
- Pose-guided Visible Part Matching For Occluded Person Reid (2020); Shang Gao, Jingya Wang, Huchuan Lu, et al.; 21.99
- Ranet: Ranking Attention Network For Fast Video Object Segmentation (2019); Ziqin Wang, Jun Xu, Li Liu, et al.; 21.97
- RTMO: Towards High-performance One-stage Real-time Multi-person Pose Estimation (2023); Peng Lu, Tao Jiang, Yining Li, et al.; 21.36
- Transcenter: Transformers With Dense Representations For Multiple-object Tracking (2021); Yihong Xu, Yutong Ban, Guillaume Delorme, et al.; 20.92
- Video Object Segmentation Using Space-time Memory Networks (2019); Seoung Wug Oh, Joon-Young Lee, Ning Xu, et al.; 20.78
- Swiftnet: Real-time Video Object Segmentation (2021); Haochen Wang, Xiaolong Jiang, Haibing Ren, et al.; 20.57
- MOTS: Multi-object Tracking And Segmentation (2019); Paul Voigtlaender, Michael Krause, Aljosa Osep, et al.; 20.41
- Youtube-vos: Sequence-to-sequence Video Object Segmentation (2018); Ning Xu, Linjie Yang, Yuchen Fan, et al.; 20.04
- Learning Feature Pyramids For Human Pose Estimation (2017); Wei Yang, Shuang Li, Wanli Ouyang, et al.; 19.60
- Quasi-dense Similarity Learning For Multiple Object Tracking (2020); Jiangmiao Pang, Linlu Qiu, Xia Li, et al.; 19.58
- Mhformer: Multi-hypothesis Transformer For 3D Human Pose Estimation (2021); Wenhao Li, Hong Liu, Hao Tang, et al.; 19.47
- Lighttrack: A Generic Framework For Online Top-down Human Pose Tracking (2019); Guanghan Ning, Heng Huang; 19.33
- Diverse Part Discovery: Occluded Person Re-identification With Part-aware Transformer (2021); Yulin Li, Jianfeng He, Tianzhu Zhang, et al.; 19.19
- Tokenpose: Learning Keypoint Tokens For Human Pose Estimation (2021); Yanjie Li, Shoukui Zhang, Zhicheng Wang, et al.; 19.16
- Cosypose: Consistent Multi-view Multi-object 6D Pose Estimation (2020); Yann Labbé, Justin Carpentier, Mathieu Aubry, et al.; 19.04
- Bi-directional Adapter For Multi-modal Tracking (2023); Bing Cao, Junliang Guo, Pengfei Zhu, et al.; 18.79
- Learning To Estimate Hidden Motions With Global Motion Aggregation (2021); Shihao Jiang, Dylan Campbell, Yao Lu, et al.; 18.71
- Lasot: A High-quality Large-scale Single Object Tracking Benchmark (2020); Heng Fan, Hexin Bai, Liting Lin, et al.; 18.67
- Cross-view Tracking For Multi-human 3D Pose Estimation At Over 100 FPS (2020); Long Chen, Haizhou Ai, Rui Chen, et al.; 18.65
- Dancetrack: Multi-object Tracking In Uniform Appearance And Diverse Motion (2021); Peize Sun, Jinkun Cao, Yi Jiang, et al.; 18.56
- Segment As Points For Efficient Online Multi-object Tracking And Segmentation (2020); Zhenbo Xu, Wei Zhang, Xiao Tan, et al.; 18.40
- Person Re-identification By Camera Correlation Aware Feature Augmentation (2017); Ying-Cong Chen, Xiatian Zhu, Wei-Shi Zheng, et al.; 18.33
- D3S -- A Discriminative Single Shot Segmentation Tracker (2019); Alan Lukežič, Jiří Matas, Matej Kristan; 18.07
- Hourglass Tokenizer For Efficient Transformer-based 3D Human Pose Estimation (2023); Wenhao Li, Mengyuan Liu, Hong Liu, et al.; 17.81
- Spatial-temporal Relation Networks For Multi-object Tracking (2019); Jiarui Xu, Yue Cao, Zheng Zhang, et al.; 17.66
- Humans In 4D: Reconstructing And Tracking Humans With Transformers (2023); Shubham Goel, Georgios Pavlakos, Jathushan Rajasegaran, et al.; 17.58
- Spm-tracker: Series-parallel Matching For Real-time Visual Object Tracking (2019); Guangting Wang, Chong Luo, Zhiwei Xiong, et al.; 17.55
- Blazingly Fast Video Object Segmentation With Pixel-wise Metric Learning (2018); Yuhua Chen, Jordi Pont-Tuset, Alberto Montes, et al.; 17.46
- Parsing-based View-aware Embedding Network For Vehicle Re-identification (2020); Dechao Meng, Liang Li, Xuejing Liu, et al.; 17.42
- Relation Distillation Networks For Video Object Detection (2019); Jiajun Deng, Yingwei Pan, Ting Yao, et al.; 17.32
- Intra-inter Camera Similarity For Unsupervised Person Re-identification (2021); Shiyu Xuan, Shiliang Zhang; 17.32
- VRSTC: Occlusion-free Video Person Re-identification (2019); Ruibing Hou, Bingpeng Ma, Hong Chang, et al.; 17.14
- Learning To Track With Object Permanence (2021); Pavel Tokmakov, Jie Li, Wolfram Burgard, et al.; 17.11
- Sg-net: Spatial Granularity Network For One-stage Video Instance Segmentation (2021); Dongfang Liu, Yiming Cui, Wenbo Tan, et al.; 17.02
- Dmm-net: Differentiable Mask-matching Network For Video Object Segmentation (2019); Xiaohui Zeng, Renjie Liao, Li Gu, et al.; 17.02
- Separable Self And Mixed Attention Transformers For Efficient Object Tracking (2023); Goutam Yelluru Gopal, Maria A. Amer; 17.00
- Modular Interactive Video Object Segmentation: Interaction-to-mask, Propagation And Difference-aware Fusion (2021); Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang; 16.86
- Arttrack: Articulated Multi-person Tracking In The Wild (2016); Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, et al.; 16.82
- Epipolar Transformers (2020); Yihui He, Rui Yan, Katerina Fragkiadaki, et al.; 16.71
- Beyond Triplet Loss: Person Re-identification With Fine-grained Difference-aware Pairwise Loss (2020); Cheng Yan, Guansong Pang, Xiao Bai, et al.; 16.67
- STA: Spatial-temporal Attention For Large-scale Video-based Person Re-identification (2018); Yang Fu, Xiaoyang Wang, Yunchao Wei, et al.; 16.61
- Leveraging Photometric Consistency Over Time For Sparsely Supervised Hand-object Reconstruction (2020); Yana Hasson, Bugra Tekin, Federica Bogo, et al.; 16.53
- End-to-end Referring Video Object Segmentation With Multimodal Transformers (2021); Adam Botach, Evgenii Zheltonozhskii, Chaim Baskin; 16.45
- Efficient Loftr: Semi-dense Local Feature Matching With Sparse-like Speed (2024); Yifan Wang, Xingyi He, Sida Peng, et al.; 16.36
- Harvesting Multiple Views For Marker-less 3D Human Pose Annotations (2017); Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, et al.; 16.28
- Spatial-temporal Person Re-identification (2018); Guangcong Wang, Jianhuang Lai, Peigen Huang, et al.; 16.21
- Sparsett: Visual Tracking With Sparse Transformers (2022); Zhihong Fu, Zehua Fu, Qingjie Liu, et al.; 16.19
- Video Panoptic Segmentation (2020); Dahun Kim, Sanghyun Woo, Joon-Young Lee, et al.; 16.19
- Efficient Regional Memory Network For Video Object Segmentation (2021); Haozhe Xie, Hongxun Yao, Shangchen Zhou, et al.; 16.19
- Kernelized Memory Network For Video Object Segmentation (2020); Hongje Seong, Junhyuk Hyun, Euntai Kim; 16.14