Typo-robust Representation Learning For Dense Retrieval
2023 · Panuthep Tasawong, Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, et al.
Abstract
Dense retrieval is a basic building block of information retrieval applications. One of the main challenges of dense retrieval in real-world settings is handling queries that contain misspelled words. A popular approach to misspelled queries is to minimize the representation discrepancy between each misspelled query and its pristine counterpart. Unlike existing approaches, which focus only on the alignment between misspelled and pristine queries, our method also improves the contrast between each misspelled query and its surrounding queries. To assess the effectiveness of our proposed method, we compare it against existing competitors using two benchmark datasets and two base encoders. Our method outperforms the competitors in all cases with misspelled queries. Our code and models are available at https://github.com/panuthept/DST-DenseRetrieval.
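The two objectives named in the abstract, alignment and contrast, can be illustrated with a toy sketch. This is not the authors' implementation (their code is at the repository linked above). It assumes a squared-distance alignment term and an InfoNCE-style in-batch contrast term, and all function names, the `temperature` and `weight` parameters, and the example vectors are hypothetical choices made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def alignment_loss(pristine, misspelled):
    """Mean squared distance between each pristine query embedding
    and the embedding of its misspelled variant (assumed form)."""
    total = 0.0
    for p, m in zip(pristine, misspelled):
        p, m = normalize(p), normalize(m)
        total += sum((x - y) ** 2 for x, y in zip(p, m))
    return total / len(pristine)


def contrast_loss(pristine, misspelled, temperature=0.1):
    """InfoNCE-style term (assumed form): misspelled query i should be
    more similar to pristine query i than to the other queries in the batch."""
    pristine = [normalize(p) for p in pristine]
    misspelled = [normalize(m) for m in misspelled]
    total = 0.0
    for i, m in enumerate(misspelled):
        logits = [dot(m, p) / temperature for p in pristine]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_norm - logits[i]
    return total / len(misspelled)


def combined_loss(pristine, misspelled, weight=1.0):
    """Alignment plus weighted contrast; the weighting is an assumption."""
    return alignment_loss(pristine, misspelled) + weight * contrast_loss(
        pristine, misspelled
    )


# Toy 2-d embeddings: two pristine queries and two candidate encodings
# of their misspelled variants.
pristine = [[1.0, 0.0], [0.0, 1.0]]
well_placed = [[0.9, 0.1], [0.1, 0.9]]  # each typo stays near its own query
confused = [[0.1, 0.9], [0.9, 0.1]]     # each typo lands near the other query

print(combined_loss(pristine, well_placed))
print(combined_loss(pristine, confused))
```

In this toy setting the combined loss is lower when each misspelled embedding stays close to its own pristine query and away from the neighbouring one, which is the behaviour the abstract describes at a high level.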
Related papers
- Typos-aware Bottlenecked Pre-training For Robust Dense Retrieval (2023) · 5.84
- Analysing The Robustness Of Dual Encoders For Dense Retrieval Against Misspellings (2022) · 9.59
- Improving The Robustness Of Dense Retrievers Against Typos Via Multi-positive Contrastive Learning (2024) · 5.84
- Pseudo-relevance Feedback For Multiple Representation Dense Retrieval (2021) · 12.93
- Learning To Retrieve: How To Train A Dense Retrieval Model Effectively And Efficiently (2020) · 0.00
- More Robust Dense Retrieval With Contrastive Dual Learning (2021) · 11.88
- Improving Query Representations For Dense Retrieval With Pseudo Relevance Feedback (2021) · 12.10
- What Are You Token About? Dense Retrieval As Distributions Over The Vocabulary (2022) · 8.09