How A General-purpose Commonsense Ontology Can Improve Performance Of Learning-based Image Retrieval
2017 · Rodrigo Toro Icarte, Jorge A. Baier, Cristian Ruz, et al.
Abstract
The knowledge representation community has built general-purpose ontologies which contain large amounts of commonsense knowledge over relevant aspects of the world, including useful visual information, e.g., "a ball is used by a football player", "a tennis player is located at a tennis court". Current state-of-the-art approaches for visual recognition do not exploit these rule-based knowledge sources. Instead, they learn recognition models directly from training examples. In this paper, we study how general-purpose ontologies (specifically, MIT's ConceptNet ontology) can improve the performance of state-of-the-art vision systems. As a testbed, we tackle the problem of sentence-based image retrieval. Our retrieval approach incorporates knowledge from ConceptNet on top of a large pool of object detectors derived from a deep learning technique. In our experiments, we show that ConceptNet can improve performance on a common benchmark dataset. Key to our performance is the use of the ESPG
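As a rough illustration of the idea in the abstract (layering commonsense relations on top of object detector outputs for sentence-based retrieval), here is a minimal Python sketch. It is not the authors' method: the detector confidences, the relation table, the weights, and the names `DETECTIONS`, `RELATED`, `word_score`, and `rank_images` are all hypothetical, and the relations are hand-written in the style of ConceptNet triples rather than taken from ConceptNet.

```python
from typing import Dict, List, Tuple

# Hypothetical per-image detector confidences (toy values, not real detector output).
DETECTIONS: Dict[str, Dict[str, float]] = {
    "img_tennis.jpg": {"racket": 0.9, "ball": 0.7, "person": 0.8},
    "img_kitchen.jpg": {"oven": 0.9, "person": 0.6},
}

# Hand-written relations in the style of ConceptNet triples (illustrative only).
# Maps a query word to (related detector word, assumed relation weight).
RELATED: Dict[str, List[Tuple[str, float]]] = {
    "tennis": [("racket", 0.8), ("ball", 0.5)],
    "chef": [("oven", 0.6), ("person", 0.4)],
}


def word_score(word: str, detections: Dict[str, float]) -> float:
    """Score one query word against one image's detections."""
    # If a detector exists for the word, use its confidence directly.
    if word in detections:
        return detections[word]
    # Otherwise back off to related words, discounted by the relation weight.
    candidates = [
        weight * detections.get(related, 0.0)
        for related, weight in RELATED.get(word, [])
    ]
    return max(candidates, default=0.0)


def rank_images(query_words: List[str]) -> List[Tuple[str, float]]:
    """Rank all images by the mean per-word score, best first."""
    scored = []
    for image, detections in DETECTIONS.items():
        scores = [word_score(w, detections) for w in query_words]
        mean = sum(scores) / len(scores) if scores else 0.0
        scored.append((image, mean))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for image, score in rank_images(["tennis", "person"]):
        print(f"{image}: {score:.2f}")
```

In this toy setup the word "tennis" has no detector of its own, so it only contributes to an image's score through the related words "racket" and "ball". That is the kind of gap a commonsense ontology could plausibly fill.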
Related papers
- Bridging The Gap Between Local Semantic Concepts And Bag Of Visual Words For Natural Scene Image Retrieval (2022)
- The Curious Layperson: Fine-grained Image Recognition Without Expert Labels (2021)
- Object-centric Open-vocabulary Image-retrieval With Aggregated Features (2023)
- Visualsem: A High-quality Knowledge Graph For Vision And Language (2020)
- Ontology-aware Network For Zero-shot Sketch-based Image Retrieval (2023)
- End-to-end Learning Of Deep Visual Representations For Image Retrieval (2016)
- Unicom: Universal And Compact Representation Learning For Image Retrieval (2023)
- Cross-modal Retrieval For Knowledge-based Visual Question Answering (2024)