GLEN: Generative Retrieval Via Lexical Index Learning
2023 · Sunkyung Lee, Minjin Choi, Jongwuk Lee
Abstract
Generative retrieval has shed light on a new paradigm of document retrieval, aiming to directly generate the identifier of a relevant document for a query. While it avoids constructing auxiliary index structures, existing studies face two significant challenges: (i) the discrepancy between the knowledge of pre-trained language models and identifiers, and (ii) the gap between training and inference, which makes learning to rank difficult. To overcome these challenges, we propose a novel generative retrieval method, namely Generative retrieval via LExical iNdex learning (GLEN). For training, GLEN exploits a dynamic lexical identifier using a two-phase index learning strategy, enabling it to learn meaningful lexical identifiers and relevance signals between queries and documents. For inference, GLEN uses collision-free inference, ranking documents by identifier weights without additional overhead. Experimental results show that GLEN achieves s
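The abstract only names the idea of ranking documents by identifier weights, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes each document stores a lexical identifier (a tuple of tokens) together with one weight per token, and that documents sharing the generated identifier are ordered by a dot product with query-side weights. The function name `rank_with_identifier_weights`, the data layout, and all numbers are hypothetical.

```python
def rank_with_identifier_weights(query_id, query_weights, docs):
    """Order documents that share the generated identifier.

    docs is a list of (name, identifier, weights) triples. This layout and
    the dot-product scoring are assumptions made for illustration only.
    """
    # Keep only documents whose identifier matches the generated one.
    candidates = [d for d in docs if d[1] == query_id]
    scored = []
    for name, _, doc_weights in candidates:
        # Score = dot product of query-side and document-side token weights.
        score = sum(q * w for q, w in zip(query_weights, doc_weights))
        scored.append((score, name))
    # Highest score first; ties broken by name so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Toy corpus: doc_a and doc_b collide on the same lexical identifier.
docs = [
    ("doc_a", ("solar", "panel", "cost"), [0.9, 0.7, 0.2]),
    ("doc_b", ("solar", "panel", "cost"), [0.5, 0.6, 0.8]),
    ("doc_c", ("wind", "turbine", "noise"), [0.8, 0.8, 0.8]),
]

ranking = rank_with_identifier_weights(
    ("solar", "panel", "cost"), [1.0, 0.5, 0.1], docs
)
print(ranking)
```

In this toy setup the two colliding documents are separated by their weights alone, so no extra index or re-ranking model is consulted; whether GLEN scores collisions in exactly this way is not stated in the text above.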
Related papers
- Continual Learning For Generative Retrieval Over Dynamic Corpora (2023) 11.49
- Generative Retrieval Meets Multi-graded Relevance (2024) 2.26
- Learning To Rank In Generative Retrieval (2023) 11.91
- Learning To Tokenize For Generative Retrieval (2023) 4.52
- Generative Retrieval As Multi-vector Dense Retrieval (2024) 8.60
- Lightweight And Direct Document Relevance Optimization For Generative Information Retrieval (2025) 4.52
- Generative Retrieval As Dense Retrieval (2023) 0.00
- Does Generative Retrieval Overcome The Limitations Of Dense Retrieval? (2025) 0.00