Deep Retrieval At Checkthat! 2025: Identifying Scientific Papers From Implicit Social Media Mentions Via Hybrid Retrieval And Re-ranking
2025 · Pascal J. Sager, Ashwini Kamaraj, Benjamin F. Grewe, et al.
Abstract
We present the methodology and results of the Deep Retrieval team for subtask 4b of the CLEF CheckThat! 2025 competition, which focuses on retrieving relevant scientific literature for given social media posts. To address this task, we propose a hybrid retrieval pipeline that combines lexical precision, semantic generalization, and deep contextual re-ranking, enabling robust retrieval that bridges the informal-to-formal language gap. Specifically, we combine BM25-based keyword matching with a FAISS vector store using a fine-tuned INF-Retriever-v1 model for dense semantic retrieval. BM25 returns the top 30 candidates, and semantic search yields 100 candidates, which are then merged and re-ranked via a large language model (LLM)-based cross-encoder. Our approach achieves a mean reciprocal rank at 5 (MRR@5) of 76.46% on the development set and 66.43% on the hidden test set, securing the 1st position on the development leaderboard and ranking 3rd on the test leaderboard (out of 31 teams).
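The merge, re-rank, and MRR@5 steps described in the abstract can be illustrated with a minimal Python sketch. It is not the team's implementation: the function names, the toy document IDs, and the `toy_scores` lookup are all hypothetical, and the scoring callable merely stands in for the LLM-based cross-encoder. The BM25 and FAISS retrieval stages are assumed to have already produced two ranked lists of document IDs.

```python
from typing import Callable, List, Sequence


def merge_candidates(lexical: Sequence[str], dense: Sequence[str],
                     k_lexical: int = 30, k_dense: int = 100) -> List[str]:
    """Union of the two top-k lists, keeping first-seen order and dropping duplicates."""
    merged: List[str] = []
    seen = set()
    for doc_id in list(lexical[:k_lexical]) + list(dense[:k_dense]):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


def rerank(query: str, candidates: Sequence[str],
           score: Callable[[str, str], float]) -> List[str]:
    """Sort candidates by relevance, highest first.

    `score` is a placeholder for a cross-encoder that scores (query, document) pairs.
    """
    return sorted(candidates, key=lambda doc_id: score(query, doc_id), reverse=True)


def mrr_at_k(rankings: Sequence[Sequence[str]], gold: Sequence[str], k: int = 5) -> float:
    """Mean reciprocal rank, counting a hit only if it appears within the top k."""
    total = 0.0
    for ranked, target in zip(rankings, gold):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id == target:
                total += 1.0 / rank
                break
    return total / len(rankings) if rankings else 0.0


# Toy data (hypothetical): two short candidate lists and hand-picked scores.
toy_scores = {"p1": 0.5, "p2": 0.1, "p3": 0.9}
merged = merge_candidates(["p1", "p2"], ["p2", "p3"])
ranked = rerank("example post text", merged, lambda q, d: toy_scores[d])
score_at_5 = mrr_at_k([ranked], ["p1"])
```

With this toy data the gold paper `p1` ends up at rank 2 after re-ranking, so its reciprocal rank is 1/2. A paper ranked below position 5 contributes zero to MRR@5, which is why the size and quality of the merged candidate pool matter before re-ranking.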
Related papers
- Airwaves At Checkthat! 2025: Retrieving Scientific Sources For Implicit Claims On Social Media With Dual Encoders And Neural Re-ranking (2025)
- DS@GT At TREC TOT 2025: Bridging Vague Recollection With Fusion Retrieval And Learned Reranking (2026)
- WSDM Cup 2026 Multilingual Retrieval: A Low-cost Multi-stage Retrieval Pipeline (2026)
- Hyrec: Exploring Hybrid-based Retriever For Chinese (2025)
- Domain-adaptive And Scalable Dense Retrieval For Content-based Recommendation (2026)
- An Analysis Of A BERT Deep Learning Strategy On A Technology Assisted Review Task (2021)
- Modernizing Facebook Scoped Search: Keyword And Embedding Hybrid Retrieval With LLM Evaluation (2025)
- Beyond Retrieval: Ensembling Cross-encoders And GPT Rerankers With LLMs For Biomedical QA (2025)