How To Read Paintings: Semantic Art Understanding With Multi-modal Retrieval
2018 · Noa Garcia, George Vogiatzis
Abstract
Automatic art analysis has mostly focused on classifying artworks into different artistic styles. However, understanding an artistic representation involves more complex processes, such as identifying the elements in the scene or recognizing author influences. We present SemArt, a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated with a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. To evaluate semantic art understanding, we envisage the Text2Art challenge, a multi-modal retrieval task where relevant paintings are retrieved according to an artistic text, and vice versa. We also propose several models for encoding visual and textual artistic representations into a common semantic space. Our best approach is able to find the correct image within the top 10 ranked images in 45.5% of the test samples. Moreover, our models sho
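The "correct image within the top 10 ranked images" figure is a recall-at-K style measure over a shared embedding space. The sketch below illustrates how such a score could be computed for text-to-image retrieval. It is not the authors' implementation: the use of cosine similarity, the function names `cosine` and `recall_at_k`, and the toy two-dimensional embeddings are all assumptions made for illustration, and text `i` is assumed to be paired with image `i`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(text_embs, image_embs, k):
    """Fraction of texts whose paired image (same index) ranks in the top k."""
    hits = 0
    for i, text in enumerate(text_embs):
        scores = [cosine(text, image) for image in image_embs]
        # Rank image indices from most to least similar to this text.
        ranked = sorted(range(len(image_embs)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(text_embs)


# Hypothetical embeddings already projected into a common space.
texts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
images = [[0.9, 0.1], [0.2, 0.8], [1.0, -1.0]]

print(recall_at_k(texts, images, 1))
print(recall_at_k(texts, images, 3))
```

In this toy data the third text is deliberately paired with a dissimilar image, so it only counts as a hit once K covers the whole collection. Swapping the roles of `texts` and `images` gives the image-to-text direction of the task.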
Related papers
- Artseek: Deep Artwork Understanding Via Multimodal In-context Reasoning And Late Interaction Retrieval (2025)
- Context-aware Embeddings For Automatic Art Analysis (2019)
- Understanding, Categorizing And Predicting Semantic Image-text Relations (2019)
- Visual Link Retrieval And Knowledge Discovery In Painting Datasets (2020)
- Museum Painting Retrieval (2021)
- Beyond Visual Semantics: Exploring The Role Of Scene Text In Image Understanding (2019)
- Multi-modal Reasoning Graph For Scene-text Based Fine-grained Image Classification And Retrieval (2020)
- Leveraging Computer Vision Application In Visual Arts: A Case Study On The Use Of Residual Neural Network To Classify And Analyze Baroque Paintings (2022)