Pre-training With Aspect-content Text Mutual Prediction For Multi-aspect Dense Retrieval
2023 · Xiaojie Sun, Keping Bi, Jiafeng Guo, et al.
Abstract
Grounded on pre-trained language models (PLMs), dense retrieval has been studied extensively on plain text. In contrast, there has been little research on retrieving data with multiple aspects using dense models. In scenarios such as product search, aspect information plays an essential role in relevance matching, e.g., category: Electronics, Computers, and Pet Supplies. A common way of leveraging aspect information for multi-aspect retrieval is to introduce an auxiliary classification objective, i.e., using item contents to predict the annotated value IDs of item aspects. However, because it learns the value embeddings from scratch, this approach may not sufficiently capture the various semantic similarities between the values. To address this limitation, we leverage the aspect information as text strings rather than class IDs during pre-training so that their semantic similarities can be naturally captured in the PLMs. To facilitate effective retrieval with the aspect strings, we p
Related papers
- A Multi-granularity-aware Aspect Learning Model For Multi-aspect Dense Retrieval (2023) 5.24
- Reproducibility Analysis And Enhancements For Multi-aspect Dense Retriever With Aspect Learning (2024) 4.26
- Dense Text Retrieval Based On Pretrained Language Models: A Survey (2022) 15.95
- Unsupervised Context Aware Sentence Representation Pretraining For Multi-lingual Dense Retrieval (2022) 3.58
- Multi-aspect Reviewed-item Retrieval Via LLM Query Decomposition And Aspect Fusion (2024) 0.00
- Large Reasoning Embedding Models: Towards Next-generation Dense Retrieval Paradigm (2025) 0.00
- CSPLADE: Learned Sparse Retrieval With Causal Language Models (2025) 0.00
- Modeling Sequential Sentence Relation To Improve Cross-lingual Dense Retrieval (2023) 1.20