Drop Your Decoder: Pre-training With Bag-of-word Prediction For Dense Passage Retrieval
2024 · Guangyuan Ma, Xing Wu, Zijia Lin, et al.
Abstract
Masked auto-encoder pre-training has emerged as a prevalent technique for initializing and enhancing dense retrieval systems. It generally utilizes additional Transformer decoder blocks to provide sustainable supervision signals and compress contextual information into dense representations. However, the underlying reasons for the effectiveness of such a pre-training technique remain unclear. The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints. Building upon this observation, we propose a modification to the traditional MAE by replacing the decoder of a masked auto-encoder with a completely simplified Bag-of-Word prediction task. This modification enables the efficient compression of lexical signa
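To make the idea of a bag-of-words prediction task concrete, the following is a minimal sketch and not the authors' implementation. It assumes the dense representation has already been projected to one logit per vocabulary entry, and that the loss is the mean negative log-likelihood of the distinct input tokens under a softmax over the vocabulary. The function names (`bow_targets`, `log_softmax`, `bow_loss`) and the `special_ids` parameter are hypothetical and chosen for illustration only.

```python
import math


def bow_targets(token_ids, special_ids=frozenset()):
    """Collapse a token sequence into its bag of words.

    Order and repetition are discarded; ids listed in special_ids
    (e.g. padding or separator tokens) are skipped.
    """
    return {t for t in token_ids if t not in special_ids}


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def bow_loss(logits, token_ids, special_ids=frozenset()):
    """Mean negative log-probability of each distinct input token.

    logits: one score per vocabulary entry, assumed to come from
            projecting the passage's dense representation.
    token_ids: the token ids of the input passage.
    """
    targets = bow_targets(token_ids, special_ids)
    if not targets:
        return 0.0
    log_probs = log_softmax(logits)
    return -sum(log_probs[t] for t in targets) / len(targets)


if __name__ == "__main__":
    # Toy vocabulary of 4 entries; the passage contains tokens 1 and 2.
    passage = [1, 2, 2]
    print(bow_loss([0.0, 0.0, 0.0, 0.0], passage))  # uninformative logits
    print(bow_loss([0.0, 5.0, 5.0, 0.0], passage))  # logits favor the input tokens
```

In this sketch no decoder layers are involved: the only supervision is whether the representation assigns high probability to the tokens that appear in the input, so the loss drops as the logits for those tokens rise.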
Related papers
- Challenging Decoder Helps In Masked Auto-encoder Pre-training For Dense Passage Retrieval (2023) · 0.00
- MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders Are Better Dense Retrievers (2022) · 9.97
- CoT-MAE v2: Contextual Masked Auto-encoder With Multi-view Modeling For Passage Retrieval (2023) · 0.00
- CoT-MoTE: Exploring Contextual Masked Auto-encoder Pre-training With Mixture-of-textual-experts For Passage Retrieval (2023) · 0.00
- Pre-train A Discriminative Text Encoder For Dense Retrieval Via Contrastive Span Prediction (2022) · 10.21
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder (2021) · 14.29
- LexMAE: Lexicon-bottlenecked Pretraining For Large-scale Retrieval (2022) · 0.00
- Query-as-context Pre-training For Dense Passage Retrieval (2022) · 7.68