On The Interpolation Of Contextualized Term-based Ranking With BM25 For Query-by-example Retrieval
2022 · Amin Abolghasemi, Arian Askari, Suzan Verberne
Abstract
Term-based ranking with pre-trained transformer-based language models has recently gained attention because it brings the contextualization power of transformer models to highly efficient term-based retrieval. In this work, we examine the generalizability of two of these deep contextualized term-based models in the context of query-by-example (QBE) retrieval, in which a seed document acts as the query to find relevant documents. In this setting, where queries are much longer than common keyword queries, BERT inference at query time is problematic because its complexity is quadratic in the input length. We investigate TILDE and TILDEv2, both of which use the BERT tokenizer as their query encoder. With this approach, there is no need for BERT inference at query time, and the query can be of any length. Our extensive evaluation on the four QBE tasks of the SciDocs benchmark shows that, in a query-by-example retrieval setting, TILDE and TILDEv2 are still less effective than a cross-encoder BERT ranker. H
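The title refers to interpolating a contextualized term-based ranker with BM25, but the excerpt above does not state the exact formula. As a rough illustration only, the sketch below assumes a common choice: min-max normalize each ranker's scores per query, then combine them linearly with a weight `alpha`. The function names, the `alpha` parameter, and the normalization step are assumptions made for this example, not details taken from the paper.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one query's document scores to [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so the ranker expresses no preference.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def interpolate_scores(
    bm25_scores: Dict[str, float],
    term_scores: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Linear interpolation: alpha * BM25 + (1 - alpha) * term-based score.

    `alpha` is a hypothetical tuning weight. Documents missing from one
    ranker's list contribute 0 for that ranker (an assumption of this sketch).
    """
    bm25_norm = min_max_normalize(bm25_scores)
    term_norm = min_max_normalize(term_scores)
    doc_ids = set(bm25_norm) | set(term_norm)
    return {
        doc_id: alpha * bm25_norm.get(doc_id, 0.0)
        + (1.0 - alpha) * term_norm.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Toy scores for one query; the values are made up for illustration.
bm25 = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
term_based = {"d1": -3.0, "d2": -1.0, "d3": -2.0}  # e.g. log-likelihood-style scores

combined = interpolate_scores(bm25, term_based, alpha=0.5)
ranking = sorted(combined, key=combined.get, reverse=True)
```

Normalizing before mixing matters because BM25 scores and scores from a neural term-based ranker usually live on different scales; without it, one ranker can dominate regardless of `alpha`.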
Related papers
- Injecting The BM25 Score As Text Improves BERT-based Re-rankers (2023) 10.48
- CEQE: Contextualized Embeddings For Query Expansion (2021) 10.35
- How Different Are Pre-trained Transformers For Text Ranking? (2022) 7.81
- Improving Transformer-kernel Ranking Model Using Conformer And Query Term Independence (2021) 7.16
- Transfer Learning Approaches For Building Cross-language Dense Retrieval Models (2022) 10.97
- Improving BERT-based Query-by-document Retrieval With Multi-task Optimization (2022) 9.92
- Incorporating Query Term Independence Assumption For Efficient Retrieval And Ranking Using Deep Neural Networks (2019) 0.00
- Shallow Cross-encoders For Low-latency Retrieval (2024) 2.26