Large Reasoning Embedding Models: Towards Next-generation Dense Retrieval Paradigm
2025 · Jianting Tang, Dongshuai Li, Tao Wen, et al.
Abstract
In modern e-commerce search systems, dense retrieval has become an indispensable component. By computing similarities between query and item (product) embeddings, it efficiently selects candidate products from large-scale repositories. With the breakthroughs in large language models (LLMs), mainstream embedding models have gradually shifted from BERT to LLMs for more accurate text modeling. However, these models still adopt direct-embedding methods, and the semantic accuracy of embeddings remains inadequate. Therefore, contrastive learning is heavily employed to achieve tight semantic alignment between positive pairs. Consequently, such models tend to capture statistical co-occurrence patterns in the training data, biasing them toward shallow lexical and semantic matches. For difficult queries exhibiting notable lexical disparity from target items, the performance degrades significantly. In this work, we propose the Large Reasoning Embedding Model (LREM), which novelly integrates reaso
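The retrieval step the abstract describes, scoring items by the similarity between query and item embeddings and keeping the best candidates, can be sketched as follows. This is a minimal illustration and not the paper's implementation: cosine similarity is an assumed choice (the abstract only says "similarities"), the item names and 3-dimensional vectors are made-up toy data, and the exhaustive scan stands in for the approximate nearest-neighbor index a production system would likely use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_u * norm_v)


def retrieve_top_k(query_emb, item_embs, k):
    """Score every item against the query and return the k best (id, score) pairs."""
    scored = [(item_id, cosine(query_emb, emb)) for item_id, emb in item_embs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from an embedding model.
items = {
    "running_shoes": [0.9, 0.1, 0.0],
    "rain_jacket": [0.1, 0.9, 0.2],
    "trail_sneakers": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

top = retrieve_top_k(query, items, k=2)
for item_id, score in top:
    print(item_id, round(score, 3))
```

Because ranking depends only on vector geometry, a query whose wording differs sharply from the target item is retrieved only if the embedding model places the two close together, which is the failure case the abstract highlights.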
Related papers
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024)
- Retrieval-GRPO: A Multi-objective Reinforcement Learning Framework For Dense Retrieval In Taobao Search (2025)
- Scaling Laws For Embedding Dimension In Information Retrieval (2026)
- Rethinking Hybrid Retrieval: When Small Embeddings And LLM Re-ranking Beat Bigger Models (2025)
- Reasoning Guided Embeddings: Leveraging MLLM Reasoning For Improved Multimodal Retrieval (2025)
- ExpandR: Teaching Dense Retrievers Beyond Queries With LLM Guidance (2025)
- LexSemBridge: Fine-grained Dense Representation Enhancement Through Token-aware Embedding Augmentation (2025)
- MRSE: An Efficient Multi-modality Retrieval System For Large Scale E-commerce (2024)