Condenser: A Pre-training Architecture For Dense Retrieval
2021 · Luyu Gao, Jamie Callan
Abstract
Pre-trained Transformer language models (LMs) have become the go-to text representation encoders. Prior research fine-tunes deep LMs to encode text sequences such as sentences and passages into single dense vector representations for efficient text comparison and retrieval. However, dense encoders require a lot of data and sophisticated techniques to train effectively, and they suffer in low-data situations. This paper finds that a key reason is that the internal attention structure of standard LMs is not ready to use for dense encoders, which need to aggregate text information into the dense representation. We propose to pre-train towards a dense encoder with a novel Transformer architecture, Condenser, where LM prediction CONditions on DENSE Representation. Our experiments show that Condenser improves over standard LMs by large margins on various text retrieval and similarity tasks.
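To make the "single dense vector" setting concrete, here is a minimal sketch of how dense retrieval compares texts once they are encoded. It is not the Condenser architecture or the authors' code: the vectors are made-up toy values standing in for encoder outputs, the inner product is assumed as the similarity function, and the names `dot` and `rank_passages` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted by descending inner-product score."""
    scores = [dot(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)


# Toy dense representations (assumed values, not real encoder outputs).
query = [0.9, 0.1, 0.0]
passages = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]

ranking = rank_passages(query, passages)
print(ranking)
```

Because each text is reduced to one vector, passage vectors can be computed ahead of time and only the query needs encoding at search time, which is what makes this form of comparison efficient.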
Related papers
- Pre-train A Discriminative Text Encoder For Dense Retrieval Via Contrastive Span Prediction (2022)
- Pre-training Vs. Fine-tuning: A Reproducibility Study On Dense Retrieval Knowledge Acquisition (2025)
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder (2021)
- Dense Text Retrieval Based On Pretrained Language Models: A Survey (2022)
- Unsupervised Context Aware Sentence Representation Pretraining For Multi-lingual Dense Retrieval (2022)
- Unsupervised Dense Retrieval With Counterfactual Contrastive Learning (2024)
- SimLM: Pre-training With Representation Bottleneck For Dense Passage Retrieval (2022)
- MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders Are Better Dense Retrievers (2022)