NV-Embed: Improved Techniques For Training LLMs As Generalist Embedding Models
2024 · Chankyu Lee, Rajarshi Roy, Mengyao Xu, et al.
Abstract
Decoder-only LLM-based embedding models are beginning to outperform BERT or T5-based embedding models in general-purpose text embedding tasks, including dense vector-based retrieval. In this work, we introduce NV-Embed, incorporating architectural designs, training procedures, and curated datasets to significantly enhance the performance of LLM as a versatile embedding model, while maintaining its simplicity and reproducibility. For model architecture, we propose a latent attention layer to obtain pooled embeddings, which consistently improves retrieval and downstream task accuracy compared to mean pooling or using the last <EOS> token embedding from LLMs. To enhance representation learning, we remove the causal attention mask of LLMs during contrastive training. For training algorithm, we introduce a two-stage contrastive instruction-tuning method. It first applies contrastive training with instructions on retrieval datasets, utilizing in-batch negatives and curated hard negative exam
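To make the abstract's terms concrete, here is a minimal sketch in plain Python of the two baseline pooling strategies it names (mean pooling and last-token pooling) and of a contrastive loss with in-batch negatives. This is not the authors' implementation: the function names, the cosine-similarity scoring, and the temperature value of 0.05 are assumptions chosen for illustration, and the latent attention layer and curated hard negatives are not reproduced here.

```python
import math


def mean_pool(token_embs, mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, m in zip(token_embs, mask) if m]
    dim = len(token_embs[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def last_token_pool(token_embs, mask):
    """Return the vector of the last non-padding token (the <EOS> position)."""
    last = max(i for i, m in enumerate(mask) if m)
    return list(token_embs[last])


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_batch_contrastive_loss(queries, passages, temperature=0.05):
    """InfoNCE-style loss: passage i is the positive for query i, and every
    other passage in the batch serves as a negative.
    The temperature default is an illustrative assumption."""
    losses = []
    for i, query in enumerate(queries):
        logits = [cosine(query, p) / temperature for p in passages]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled_mean = mean_pool(tokens, mask)
pooled_last = last_token_pool(tokens, mask)

# Two query/passage pairs; the loss is lower when pairs are aligned.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = in_batch_contrastive_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = in_batch_contrastive_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is small when each query sits next to its own passage and large when the passages are swapped, which is the signal contrastive training uses to pull matching pairs together.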
Related papers
- LLaVE: Large Language And Vision Embedding Models With Hardness-weighted Contrastive Learning (2025) 3.58
- Pooling And Attention: What Are Effective Designs For LLM-based Embedding Models? (2024) 0.00
- Training LLMs To Be Better Text Embedders Through Bidirectional Reconstruction (2025) 0.00
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024) 0.00
- Vill-e: Video LLM Embeddings For Retrieval (2026) 0.00
- Vidvec: Unlocking Video MLLM Embeddings For Video-text Retrieval (2026) 0.00
- Nemotron ColEmbed V2: Top-performing Late Interaction Embedding Models For Visual Document Retrieval (2026) 0.00
- NV-Retriever: Improving Text Embedding Models With Effective Hard-negative Mining (2024) 0.00