CLEAR: Cross-lingual Enhancement In Alignment Via Reverse-training
2026 · Seungyoon Lee, Minhyuk Kim, Seongtae Hong, et al.
Abstract
Existing multilingual embedding models often encounter challenges in cross-lingual scenarios due to imbalanced linguistic resources and less consideration of cross-lingual alignment during training. Although standardized contrastive learning approaches for cross-lingual adaptation are widely adopted, they may struggle to capture fundamental alignment between languages and degrade performance in well-aligned languages such as English. To address these challenges, we propose Cross-Lingual Enhancement in Retrieval via Reverse-training (CLEAR), a novel loss function utilizing a reverse training scheme to improve retrieval performance across diverse cross-lingual retrieval scenarios. CLEAR leverages an English passage as a bridge to strengthen alignments between the target language and English, ensuring robust performance in the cross-lingual retrieval task. Our extensive experiments demonstrate that CLEAR achieves notable improvements in cross-lingual scenarios, with gains up to 15%, parti
Related papers
- What Drives Cross-lingual Ranking? Retrieval Approaches With Multilingual Language Models (2025)
- CL2CM: Improving Cross-lingual Cross-modal Retrieval Via Cross-lingual Knowledge Transfer (2023)
- Boosting Data Utilization For Multilingual Dense Retrieval (2025)
- Bridging Language Gaps: Advances In Cross-lingual Information Retrieval With Multilingual LLMs (2025)
- Complementing Lexical Retrieval With Semantic Residual Embedding (2020)
- Uclip: Parameter-efficient Multilingual Extension Of Vision-language Models With Unpaired Data (2025)
- Steering Into New Embedding Spaces: Analyzing Cross-lingual Alignment Induced By Model Interventions In Multilingual Language Models (2025)
- Translate-distill: Learning Cross-language Dense Retrieval By Translation And Distillation (2024)