Domain-adaptive And Scalable Dense Retrieval For Content-based Recommendation
2026 · Mritunjay Pandey
Abstract
E-commerce recommendation and search commonly rely on sparse keyword matching (e.g., BM25), which breaks down under vocabulary mismatch when user intent has limited lexical overlap with product metadata. We cast content-based recommendation as recommendation-as-retrieval: given a natural-language intent signal (a query or review), retrieve the top-K most relevant items from a large catalog via semantic similarity. We present a scalable dense retrieval system based on a two-tower bi-encoder, fine-tuned on the Amazon Reviews 2023 (Fashion) subset using supervised contrastive learning with Multiple Negatives Ranking Loss. We construct training pairs from review text (as a query proxy) and item metadata (as the positive document) and fine-tune on 50,000 sampled interactions with a maximum sequence length of 500 tokens. For efficient serving, we combine FAISS HNSW indexing with an ONNX Runtime inference pipeline using INT8 dynamic quantization. On a review-to-title benchmark over 826,40
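As a rough illustration of the training objective and retrieval step named in the abstract, the following is a minimal pure-Python sketch, not the authors' implementation. It treats Multiple Negatives Ranking Loss as in-batch softmax cross-entropy over scaled cosine similarities, where each query's paired document is the positive and all other documents in the batch serve as negatives. The `scale` value, the function names, and the toy 2-dimensional vectors are assumptions for illustration; exact brute-force search stands in for the approximate FAISS HNSW index.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(queries, docs, scale=20.0):
    """In-batch contrastive loss (hypothetical helper).

    docs[i] is the positive for queries[i]; every other docs[j]
    in the batch acts as a negative. `scale` is an assumed
    temperature-like multiplier, not a value from the abstract.
    """
    losses = []
    for i, q in enumerate(queries):
        logits = [scale * cosine(q, d) for d in docs]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching document as the target class.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


def top_k(query, catalog, k):
    """Exact top-K by cosine similarity (stand-in for an ANN index)."""
    scored = [(cosine(query, d), idx) for idx, d in enumerate(catalog)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scored[:k]]


# Toy embeddings: "review" vectors and "item metadata" vectors.
review_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned_items = [[1.0, 0.0], [0.0, 1.0]]
misaligned_items = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = mnr_loss(review_vecs, aligned_items)
loss_misaligned = mnr_loss(review_vecs, misaligned_items)
nearest = top_k([1.0, 0.1], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2)
```

In this toy setup the loss is near zero when each review embedding points at its own item and large when the pairs are swapped, which is the signal fine-tuning uses to pull matching review and item embeddings together. At catalog scale, the brute-force `top_k` would be replaced by an approximate nearest-neighbor index such as the HNSW index the abstract mentions.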
Related papers
- Hierarchical Structured Neural Network: Efficient Retrieval Scaling For Large Scale Recommendation (2024)
- Deep Retrieval: Learning A Retrievable Structure For Large-scale Recommendations (2020)
- Large Reasoning Embedding Models: Towards Next-generation Dense Retrieval Paradigm (2025)
- Self-supervised Multi-modal Sequential Recommendation (2023)
- Mine And Refine: Optimizing Graded Relevance In E-commerce Search Retrieval (2026)
- Bridging Language And Items For Retrieval And Recommendation: Benchmarking LLMs As Semantic Encoders (2024)
- MRSE: An Efficient Multi-modality Retrieval System For Large Scale E-commerce (2024)
- ESANS: Effective And Semantic-aware Negative Sampling For Large-scale Retrieval Systems (2025)