SAGHOG: Self-supervised Autoencoder For Generating HOG Features For Writer Retrieval
2024 · Marco Peer, Florian Kleber, Robert Sablatnig
Abstract
This paper introduces SAGHOG, a self-supervised pretraining strategy for writer retrieval using HOG features of the binarized input image. Our preprocessing applies the Segment Anything technique to extract handwriting from various datasets, yielding about 24k documents. We then train a vision transformer to reconstruct masked patches of the handwriting. SAGHOG is finetuned by appending NetRVLAD as an encoding layer to the pretrained encoder. Evaluation of our approach on three historical datasets, Historical-WI, HisFrag20, and GRK-Papyri, demonstrates the effectiveness of SAGHOG for writer retrieval. Additionally, we provide ablation studies on our architecture and evaluate unsupervised and supervised finetuning. Notably, on HisFrag20, SAGHOG outperforms related work with a mAP of 57.2 %, a margin of 11.6 % to the current state of the art, showcasing its robustness on challenging data, and is competitive even on small datasets, e.g. GRK-Papyri, where w
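The abstract reports results as mAP (mean average precision). As a minimal sketch of how this metric is commonly computed for writer retrieval, the following assumes a leave-one-out protocol in which every document queries all others and the gallery is ranked by Euclidean distance between descriptors. The function names, the distance measure, and the toy data are illustrative assumptions and are not taken from the paper.

```python
import math


def average_precision(ranked_writers, query_writer):
    """Average precision for one query, given the writer labels of the
    gallery documents in ranked order (most similar first)."""
    hits = 0
    precision_sum = 0.0
    for rank, writer in enumerate(ranked_writers, start=1):
        if writer == query_writer:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(descriptors, writers):
    """Leave-one-out mAP: each document is used once as the query and is
    compared against all remaining documents.

    Assumption: Euclidean distance is used for ranking; other measures
    (e.g. cosine distance) can be swapped in."""
    aps = []
    for q, (q_desc, q_writer) in enumerate(zip(descriptors, writers)):
        gallery = [
            (math.dist(q_desc, desc), writer)
            for i, (desc, writer) in enumerate(zip(descriptors, writers))
            if i != q
        ]
        gallery.sort(key=lambda pair: pair[0])
        ranked = [writer for _, writer in gallery]
        # Queries whose writer has no other document are skipped.
        if q_writer in ranked:
            aps.append(average_precision(ranked, q_writer))
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: five 2-D descriptors from two writers.
toy_descriptors = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.2, 0.1)]
toy_writers = ["A", "A", "B", "B", "A"]
toy_map = mean_average_precision(toy_descriptors, toy_writers)
```

In this toy set the two writers are cleanly separated, so every query ranks its own writer's documents first. On real data, relevant documents ranked below irrelevant ones lower the per-query average precision and hence the mAP.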
Related papers
- Self-supervised Vision Transformers For Writer Retrieval (2024) 5.24
- Online Writer Retrieval With Chinese Handwritten Phrases: A Synergistic Temporal-frequency Representation Learning Approach (2024) 7.11
- Feature Mixing For Writer Retrieval And Identification On Papyri Fragments (2023) 7.16
- Planning Ahead In Generative Retrieval: Guiding Autoregressive Generation Through Simultaneous Decoding (2024) 8.82
- HADA: A Graph-based Amalgamation Framework In Image-text Retrieval (2023) 7.05
- Writer Identification And Writer Retrieval Based On Netvlad With Re-ranking (2020) 8.82
- Language-agnostic Visual Embeddings For Cross-script Handwriting Retrieval (2026) 0.00
- Adversarial Training For Sketch Retrieval (2016) 11.19