A Comparative Analysis of Retrieval-Augmented Generation Architectures with Semantic Hashing for Enterprise Knowledge Systems

Abstract

Large Language Models (LLMs) have significant potential to transform enterprise knowledge management. The Retrieval-Augmented Generation (RAG) architecture has become the industry standard for producing accurate, domain-specific responses while mitigating hallucinations. However, the retrieval stage of standard RAG implementations poses latency, computational cost, and accuracy challenges when scaling to large document collections. This paper presents a systematic comparative analysis of four RAG retrieval architectures evaluated on a real-world enterprise wiki dataset containing 279 documents across 14 categories. The architectures examined are: (1) Dense Retrieval using FAISS, (2) Hybrid Retrieval combining FAISS with BM25, (3) Semantic Hashing-only retrieval, and (4) a novel two-stage hybrid model that uses Semantic Hashing for candidate filtering followed by vector-based reranking. We evaluate these architectures on 84 domain-specific questions, measuring answer accuracy (via semantic similarity), query latency, and memory consumption. Our experimental results show that the two-stage hybrid architecture achieves the highest accuracy, 82.14%, with a mean response time of 2.96 seconds, outperforming both pure dense retrieval (78.57%) and the hybrid BM25 approach (64.29%). We provide a detailed analysis of the prompt engineering strategies that contributed to these results and discuss the trade-offs between retrieval speed, accuracy, and resource utilization for enterprise-scale RAG deployments.
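To make the two-stage idea in architecture (4) concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the semantic hash codes are produced, so this sketch substitutes random-hyperplane hashing as a stand-in. The embeddings are random placeholders, and the names `binary_codes` and `two_stage_search`, the code length (32 bits), the embedding size (64), and the candidate count (20) are all illustrative assumptions. Only the corpus size of 279 documents is taken from the abstract.

```python
import numpy as np


def binary_codes(embeddings, hyperplanes):
    """One bit per hyperplane: the sign of each random projection.

    ASSUMPTION: random-hyperplane hashing stands in for the paper's
    (unspecified) semantic hashing scheme.
    """
    return (embeddings @ hyperplanes.T) > 0


def two_stage_search(query_vec, doc_vecs, doc_codes, hyperplanes,
                     n_candidates=20, top_k=3):
    """Stage 1: Hamming-distance filter. Stage 2: cosine rerank."""
    # Stage 1: compare the query's binary code with every document code
    # and keep the n_candidates documents with the fewest differing bits.
    q_code = binary_codes(query_vec[None, :], hyperplanes)[0]
    hamming = np.count_nonzero(doc_codes != q_code, axis=1)
    candidates = np.argsort(hamming, kind="stable")[:n_candidates]

    # Stage 2: rerank only the surviving candidates by cosine similarity
    # on the full-precision vectors.
    cand_vecs = doc_vecs[candidates]
    sims = cand_vecs @ query_vec / (
        np.linalg.norm(cand_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    order = np.argsort(-sims, kind="stable")[:top_k]
    return [(int(candidates[i]), float(sims[i])) for i in order]


# Placeholder data: 279 random "document embeddings" (hypothetical values).
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(279, 64))
hyperplanes = rng.normal(size=(32, 64))      # 32-bit codes (assumed length)
doc_codes = binary_codes(doc_vecs, hyperplanes)

# A query that is a slightly perturbed copy of document 42.
query = doc_vecs[42] + 0.05 * rng.normal(size=64)
results = two_stage_search(query, doc_vecs, doc_codes, hyperplanes)
```

Here `results` is a list of `(document_index, cosine_similarity)` pairs, best match first. The design point the sketch illustrates is that the bit comparison in stage 1 is cheap and touches every document, while the more expensive full-vector similarity in stage 2 runs only on the small candidate set.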
