Integrity And Junkiness Failure Handling For Embedding-based Retrieval: A Case Study In Social Network Search
2023 · Wenping Wang, Yunxi Guo, Chiyao Shen, et al.
Abstract
Embedding-based retrieval is used in a variety of search applications, such as e-commerce and social network search. While the approach has proven effective in tasks like semantic matching and contextual search, it is plagued by the problem of uncontrollable relevance. In this paper, we analyze the embedding-based retrieval launched in early 2021 on our social network search engine and define two main categories of failures it introduces: integrity and junkiness. The former refers to issues such as hate speech and offensive content that can severely harm user experience, while the latter covers irrelevant results such as fuzzy text matches or language mismatches. We further propose efficient methods applied during model inference to resolve these issues, including indexing treatments and targeted user cohort treatments. Though simple, the methods yield gains in offline NDCG and in online A/B test metrics in practice. We analyze the reasons for
Related papers
- Embedding-based Retrieval In Facebook Search (2020) · 18.09
- Unified Embedding Based Personalized Retrieval In Etsy Search (2023) · 2.26
- Modernizing Facebook Scoped Search: Keyword And Embedding Hybrid Retrieval With LLM Evaluation (2025) · 0.00
- Enhancing Relevance Of Embedding-based Retrieval At Walmart (2024) · 7.16
- Taxonomy Of The Retrieval System Framework: Pitfalls And Paradigms (2026) · 0.00
- Dense Retrievers Can Fail On Simple Queries: Revealing The Granularity Dilemma Of Embeddings (2025) · 2.86
- Applying Embedding-based Retrieval To Airbnb Search (2026) · 0.00
- Que2engage: Embedding-based Retrieval For Relevant And Engaging Products At Facebook Marketplace (2023) · 6.34