Diagnosable ColBERT: Debugging Late-interaction Retrieval Models Using A Learned Latent Space As Reference
2026 · François Remy
Abstract
Reliable biomedical and clinical retrieval requires more than strong ranking performance: it requires a practical way to find systematic model failures and curate the training evidence needed to correct them. Late-interaction models such as ColBERT offer a first step, because they expose interpretable token-level interaction scores between query and document tokens. Yet this interpretability is shallow: it explains the score of a particular document-query pair, but does not reveal whether the model has learned a clinical concept in a stable, reusable, and context-sensitive way across diverse expressions. As a result, these scores provide limited support for diagnosing misunderstandings, identifying unreasonably distant biomedical concepts, or deciding what additional data or feedback is needed to correct them. In this short position paper, we propose Diagnosable ColBERT, a framework that aligns ColBERT token embeddings to a reference latent space grounded in clinical knowledge
Related papers
- ColBERT-Att: Late-interaction Meets Attention For Enhanced Retrieval (2026) · 0.00
- ColBERTv2: Effective And Efficient Retrieval Via Lightweight Late Interaction (2021) · 17.46
- ColBERT: Efficient And Effective Passage Search Via Contextualized Late Interaction Over BERT (2020) · 0.00
- Introducing Neural Bag Of Whole-words With ColBERTer: Contextualized Late Interactions Using Enhanced Reduction (2022) · 0.00
- PyLate: Flexible Training And Retrieval For Late Interaction Models (2025) · 3.58
- Jina-ColBERT-v2: A General-purpose Multilingual Late Interaction Retriever (2024) · 5.24
- Video-ColBERT: Contextualized Late Interaction For Text-to-video Retrieval (2025) · 5.24
- Col-Bandit: Zero-shot Query-time Pruning For Late-interaction Retrieval (2026) · 0.00