Large Scale Question Paraphrase Retrieval With Smoothed Deep Metric Learning
2019 · Daniele Bonadiman, Anjishnu Kumar, Arpit Mittal
Abstract
The goal of a Question Paraphrase Retrieval (QPR) system is to retrieve equivalent questions that result in the same answer as the original question. Such a system can be used to understand and answer rare and noisy reformulations of common questions by mapping them to a set of canonical forms. This has large-scale applications for community Question Answering (cQA) and open-domain spoken language question answering systems. In this paper we describe a new QPR system implemented as a Neural Information Retrieval (NIR) system consisting of a neural network sentence encoder and an approximate k-Nearest Neighbour index for efficient vector retrieval. We also describe our mechanism to generate an annotated dataset for question paraphrase retrieval experiments automatically from question-answer logs via distant supervision. We show that the standard loss function in NIR, triplet loss, does not perform well with noisy labels. We propose smoothed deep metric loss (SDML) and with our experimen
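The retrieval pipeline described in the abstract (encode a question, then look up its nearest canonical form) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `encode` is a toy bag-of-words stand-in for the neural sentence encoder, `ExactIndex` is a brute-force stand-in for the approximate k-Nearest Neighbour index, and `triplet_loss` is the textbook margin-based form on cosine distance with a hypothetical margin of 0.5. The smoothed loss (SDML) is not sketched, since its definition is not given in the abstract.

```python
import math
from collections import Counter


def encode(question):
    """Toy stand-in for the sentence encoder: L2-normalised bag of words."""
    counts = Counter(question.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


class ExactIndex:
    """Brute-force stand-in for an approximate k-NN index (hypothetical class)."""

    def __init__(self, canonical_questions):
        self.questions = list(canonical_questions)
        self.vectors = [encode(q) for q in self.questions]

    def search(self, query, k=1):
        """Return the k canonical questions most similar to the query."""
        q = encode(query)
        scored = [(text, cosine(q, vec))
                  for text, vec in zip(self.questions, self.vectors)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Textbook triplet loss on cosine distance (margin is an assumed value)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


index = ExactIndex(["how tall is mount everest", "who wrote hamlet"])
top_text, top_score = index.search("how tall is everest", k=1)[0]
```

In this sketch a noisy reformulation is mapped to its canonical form by similarity alone. The triplet loss is zero when the paraphrase is already closer to the anchor than the non-paraphrase by at least the margin, and positive otherwise, which is why a mislabelled pair in the training data produces a misleading gradient.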
Related papers
- Hyperbolic Representation Learning For Fast And Efficient Neural Question Answering (2017) 12.61
- Efficient Passage Retrieval With Hashing For Open-domain Question Answering (2021) 15.77
- Enhancing Question Answering Precision With Optimized Vector Retrieval And Instructions (2024) 0.00
- Dense Passage Retrieval In Conversational Search (2025) 0.00
- Synthetic Target Domain Supervision For Open Retrieval QA (2022) 4.52
- Text Embeddings For Retrieval From A Large Knowledge Base (2018) 4.52
- Cross Modal Retrieval With Querybank Normalisation (2021) 14.06
- QDER: Query-specific Document And Entity Representations For Multi-vector Document Re-ranking (2025) 0.00