MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders Are Better Dense Retrievers
2022 · Kun Zhou, Xiao Liu, Yeyun Gong, et al.
Abstract
Pre-trained Transformers (e.g., BERT) are commonly used to initialize parameters in existing dense retrieval methods, and recent studies are exploring more effective pre-training tasks to further improve the quality of dense vectors. Although various novel and effective tasks have been proposed, their different input formats and learning objectives make them hard to integrate for jointly improving model performance. In this work, we aim to unify a variety of pre-training tasks in the bottlenecked masked autoencoder manner and integrate them into a multi-task pre-trained model, namely MASTER. Concretely, MASTER utilizes a shared-encoder multi-decoder architecture that can construct a representation bottleneck to compress the abundant semantic information across tasks into dense vectors. Based on it, we integrate three types of representative pre-training tasks: corrupted passages recovering, related passages recovering and PLMs outputs recovering, to characterize t
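The shared-encoder multi-decoder setup described in the abstract can be illustrated with a small data-preparation sketch. It shows only how one passage might be masked for the shared encoder while a task-specific text is masked more heavily for that task's decoder, so the decoder has to rely on the encoder's dense vector. The function names, the task label, and the masking rates (0.3 and 0.5) are hypothetical choices for illustration and are not taken from the paper.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, rate, rng):
    """Mask about `rate` of the tokens; return the masked list and the hidden originals."""
    n_mask = min(len(tokens), max(1, round(len(tokens) * rate)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = {i: tokens[i] for i in positions}  # position -> original token
    return masked, labels

def build_example(task, encoder_tokens, decoder_tokens, rng,
                  encoder_rate=0.3, decoder_rate=0.5):
    """Prepare inputs for the shared encoder and for the decoder of one task.

    Assumption: the decoder text is masked more aggressively than the encoder
    text, so recovering it depends on the dense vector from the encoder.
    """
    enc_in, enc_labels = mask_tokens(encoder_tokens, encoder_rate, rng)
    dec_in, dec_labels = mask_tokens(decoder_tokens, decoder_rate, rng)
    return {
        "task": task,  # selects which decoder would receive this example
        "encoder_input": enc_in,
        "encoder_labels": enc_labels,
        "decoder_input": dec_in,
        "decoder_labels": dec_labels,
    }

rng = random.Random(0)
passage = "dense retrieval maps queries and passages into one vector space".split()
related = "a neighbouring passage from the same document serves as context".split()

# Corrupted-passage recovery: the decoder sees the same passage, masked harder.
corrupted = build_example("corrupted_passage", passage, passage, rng)
# Related-passage recovery: the decoder sees a different, related passage.
neighbour = build_example("related_passage", passage, related, rng)
print(corrupted["encoder_input"])
print(neighbour["decoder_input"])
```

In this sketch the tasks differ only in which text is handed to the decoder. A PLM-output recovery example would pass text produced by another language model as `decoder_tokens` in the same way.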
Related papers
- Drop Your Decoder: Pre-training With Bag-of-word Prediction For Dense Passage Retrieval (2024) 3.58
- Challenging Decoder Helps In Masked Auto-encoder Pre-training For Dense Passage Retrieval (2023) 0.00
- CoT-MAE v2: Contextual Masked Auto-encoder With Multi-view Modeling For Passage Retrieval (2023) 0.00
- CoT-MoTE: Exploring Contextual Masked Auto-encoder Pre-training With Mixture-of-textual-experts For Passage Retrieval (2023) 0.00
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder (2021) 14.29
- Investigating Multi-layer Representations For Dense Passage Retrieval (2025) 0.00
- LexMAE: Lexicon-bottlenecked Pretraining For Large-scale Retrieval (2022) 0.00
- Pre-train A Discriminative Text Encoder For Dense Retrieval Via Contrastive Span Prediction (2022) 10.21