Improving Embedding With Contrastive Fine-tuning On Small Datasets With Expert-augmented Scores
2024 · Jun Lu, David Li, Bill Ding, et al.
Abstract
This paper presents an approach to improving text embedding models through contrastive fine-tuning on small datasets augmented with expert scores. It targets semantic textual similarity tasks and text retrieval problems. The proposed method fine-tunes embedding models using soft labels derived from expert-augmented scores, preserving the models' versatility while improving their retrieval capability. The method is evaluated using a Q&A dataset from an online shopping website and eight expert models. Results show improved performance over a benchmark model across multiple metrics on various retrieval tasks from the Massive Text Embedding Benchmark (MTEB). The method is cost-effective and practical for real-world applications, especially when labeled data is scarce.
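The idea of fine-tuning against soft labels can be illustrated with a minimal sketch. The abstract does not specify how the expert scores are combined or which loss is used, so the simple mean in `soft_label` and the squared-error objective in `soft_label_loss` are assumptions made for illustration only, as are the function names and the toy data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_label(expert_scores):
    """Combine expert similarity scores into one soft label.

    Assumption: a plain mean; the paper may aggregate differently.
    """
    return sum(expert_scores) / len(expert_scores)


def soft_label_loss(pairs, labels):
    """Mean squared error between embedding similarity and soft labels.

    Assumption: squared error stands in for the paper's actual objective.
    """
    errors = [(cosine(q, a) - y) ** 2 for (q, a), y in zip(pairs, labels)]
    return sum(errors) / len(errors)


# Hypothetical toy data: two question/answer pairs, eight expert scores each.
expert_scores = [
    [0.9, 0.8, 0.85, 0.95, 0.9, 0.8, 0.9, 0.9],   # experts agree: relevant
    [0.1, 0.2, 0.15, 0.1, 0.05, 0.2, 0.1, 0.1],   # experts agree: irrelevant
]
labels = [soft_label(scores) for scores in expert_scores]

# Hypothetical 2-d embeddings of (question, answer) from the model being tuned.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0]),
]
loss = soft_label_loss(pairs, labels)
print(labels, loss)
```

In an actual fine-tuning run, the embeddings would come from the model being trained and the loss would be minimized by gradient descent. The point of the sketch is only that the target for each pair is a graded score rather than a hard 0/1 label.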
Related papers
- Efficient Fine-tuning Methodology Of Text Embedding Models For Information Retrieval: Contrastive Learning Penalty (CLP) (2024)
- REFINE On Scarce Data: Retrieval Enhancement Through Fine-tuning Via Model Fusion Of Embedding Models (2024)
- Contrastive Learning And Mixture Of Experts Enables Precise Vector Embeddings (2024)
- Refining Joint Text And Source Code Embeddings For Retrieval Task With Parameter-efficient Fine-tuning (2024)
- NV-Retriever: Improving Text Embedding Models With Effective Hard-negative Mining (2024)
- GISTEmbed: Guided In-sample Selection Of Training Negatives For Text Embedding Fine-tuning (2024)
- Text And Code Embeddings By Contrastive Pre-training (2022)
- Improving Natural-language-based Audio Retrieval With Transfer Learning And Audio & Text Augmentations (2022)