LifeIR at the NTCIR-18 Lifelog-6 Task
2025 · Jiahan Chen, Da Li, Keping Bi
Abstract
In recent years, sharing lifelogs recorded with wearable devices such as sports watches and GoPros has gained significant popularity. Lifelogs contain various types of information, including images, videos, and GPS data, and reveal users' lifestyles, dietary patterns, and physical activities. The Lifelog Semantic Access Task (LSAT) in the NTCIR-18 Lifelog-6 Challenge focuses on retrieving relevant images from large-scale user lifelogs based on textual queries that describe an action or event. It serves users' need to find images of a particular scenario among the past moments in their lifelogs. We propose a multi-stage pipeline for this text-to-image search task that addresses several challenges in lifelog retrieval. Our pipeline includes: filtering blurred images, rewriting queries to make intents clearer, extending the candidate set based on events to include images with temporal connections, and reranking results using a multimodal large language model (MLLM) with stronger re
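To make the candidate-extension step in the abstract more concrete, the following is a minimal sketch of one way temporally connected images could be added to a retrieved set. It is an illustration only, not the authors' method: the function name `expand_by_time`, the fixed time window, and the `(timestamp, image_id)` timeline format are all assumptions, and the paper's actual definition of an event may differ.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def expand_by_time(hits, timeline, window=timedelta(minutes=5)):
    """Return the hits plus every image recorded within `window` of a hit.

    hits:     image ids returned by a first-stage retriever (hypothetical).
    timeline: list of (timestamp, image_id) tuples, sorted by timestamp.
    window:   assumed stand-in for an "event"; a fixed time radius.
    """
    times = [t for t, _ in timeline]           # sorted timestamps for bisect
    stamp = {img: t for t, img in timeline}    # image id -> timestamp
    expanded, seen = [], set()
    for img in hits:
        t = stamp[img]
        lo = bisect_left(times, t - window)    # first image inside the window
        hi = bisect_right(times, t + window)   # one past the last image inside
        for _, neighbour in timeline[lo:hi]:
            if neighbour not in seen:          # keep order, drop duplicates
                seen.add(neighbour)
                expanded.append(neighbour)
    return expanded


# Toy timeline: three images taken close together, two taken much later.
timeline = [
    (datetime(2020, 1, 1, 9, 0), "a"),
    (datetime(2020, 1, 1, 9, 2), "b"),
    (datetime(2020, 1, 1, 9, 4), "c"),
    (datetime(2020, 1, 1, 9, 30), "d"),
    (datetime(2020, 1, 1, 12, 0), "e"),
]
print(expand_by_time(["b"], timeline))
```

In this sketch a single hit pulls in its temporal neighbours, so a later reranking stage would see images from the same moment even if the first-stage retriever missed them. A fixed window is the simplest choice; an event segmentation of the lifelog could replace it without changing the interface.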