HEAL: Hierarchical Embedding Alignment Loss For Improved Retrieval And Representation Learning
2024 · Manish Bhattarai, Ryan Barron, Maksim Eren, et al.
Abstract
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by integrating external document retrieval to provide domain-specific or up-to-date knowledge. The effectiveness of RAG depends on the relevance of retrieved documents, which is influenced by the semantic alignment of embeddings with the domain's specialized content. Although full fine-tuning can align language models to specific domains, it is computationally intensive and demands substantial data. This paper introduces Hierarchical Embedding Alignment Loss (HEAL), a novel method that leverages hierarchical fuzzy clustering with matrix factorization within contrastive learning to efficiently align LLM embeddings with domain-specific content. HEAL computes level/depth-wise contrastive losses and incorporates hierarchical penalties to align embeddings with the underlying relationships in label hierarchies. This approach enhances retrieval relevance and document classification, effectively reducing hallucinations
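The level-wise loss described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the hierarchical clustering step has already assigned each document one cluster label per hierarchy level, that each level uses a standard supervised contrastive loss over cosine similarities, and that the "hierarchical penalties" act as per-level weights. The function names, the temperature, and the weight values are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def level_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss for one hierarchy level.

    Samples sharing a label at this level are treated as positives;
    all other samples in the batch act as negatives.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        log_denom = math.log(sum(math.exp(x) for x in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def hierarchical_loss(embeddings, labels_per_level, level_weights, temperature=0.1):
    """Weighted sum of per-level losses (the weights are an assumed penalty form)."""
    return sum(
        w * level_contrastive_loss(embeddings, labels, temperature)
        for w, labels in zip(level_weights, labels_per_level)
    )


# Toy batch: two coarse clusters, each split into two fine clusters.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_labels = [[0, 0, 1, 1], [0, 1, 2, 3]]     # coarse level, fine level
misaligned_labels = [[0, 1, 0, 1], [0, 1, 2, 3]]  # coarse labels cut across geometry
level_weights = [1.0, 0.5]                        # assumed: coarse level weighted higher

aligned = hierarchical_loss(embeddings, aligned_labels, level_weights)
misaligned = hierarchical_loss(embeddings, misaligned_labels, level_weights)
```

In this toy batch the coarse labels in `aligned_labels` agree with the geometry of the embeddings, so `aligned` comes out smaller than `misaligned`. Minimizing such a loss during fine-tuning would pull together embeddings that share a cluster at a given level, which is the alignment effect the abstract describes.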
Related papers
- Re-ranking The Context For Multimodal Retrieval Augmented Generation (2025) 0.00
- Multi-head RAG: Solving Multi-aspect Problems With LLMs (2024) 0.00
- Domain-aware RAG: MoL-enhanced RL For Efficient Training And Scalable Retrieval (2025) 0.00
- LMAR: Language Model Augmented Retriever For Domain-specific Knowledge Indexing (2025) 1.57
- HetaRAG: Hybrid Deep Retrieval-augmented Generation Across Heterogeneous Data Stores (2025) 3.27
- LUMA-RAG: Lifelong Multimodal Agents With Provably Stable Streaming Alignment (2025) 0.00
- Advancing Retrieval-augmented Generation For Structured Enterprise And Internal Data (2025) 1.20
- Retrieval-augmented Perception: High-resolution Image Perception Meets Visual RAG (2025) 0.00