A Multi-resolution Word Embedding For Document Retrieval From Large Unstructured Knowledge Bases
2019 · Tolgahan Cakaloglu, Xiaowei Xu
Abstract
Deep language models that learn a hierarchical representation have proved to be a powerful tool for natural language processing, text mining, and information retrieval. However, representations that perform well for retrieval must capture semantic meaning at different levels of abstraction, or context-scopes. In this paper, we propose a new method to generate multi-resolution word embeddings that represent documents at multiple resolutions in terms of context-scopes. To investigate its performance, we use the Stanford Question Answering Dataset (SQuAD) and the Question Answering by Search And Reading (QUASAR) dataset in an open-domain question-answering setting, where the first task is to find documents useful for answering a given question. To this end, we first compare the retrieval quality of various text-embedding methods, and then give an extensive empirical comparison of various non-augmented base embeddings with and without the multi-resolution representation.
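The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the resolutions are combined or how documents are ranked, so everything below is an assumption made for illustration: each document and question is taken to already have one vector per context-scope, the vectors are L2-normalized and concatenated, ranking uses cosine similarity, and quality is checked with recall@k. The names `combine_resolutions`, `retrieve`, and `recall_at_k`, and the toy vectors, are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_resolutions(vectors):
    """Concatenate per-resolution vectors after normalizing each one.

    Assumption: concatenation is only one plausible way to merge
    context-scopes; the paper's actual combination rule may differ.
    """
    combined = []
    for vec in vectors:
        combined.extend(l2_normalize(vec))
    return l2_normalize(combined)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (reduces to a dot product)."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


def recall_at_k(ranked_ids, relevant_id):
    """1.0 if the relevant document is among the retrieved ids, else 0.0."""
    return 1.0 if relevant_id in ranked_ids else 0.0


# Toy data (made up): two resolutions per item, e.g. a narrow and a wide context-scope.
docs = {
    "doc_a": combine_resolutions([[1.0, 0.0], [0.0, 1.0]]),
    "doc_b": combine_resolutions([[0.0, 1.0], [1.0, 0.0]]),
    "doc_c": combine_resolutions([[1.0, 0.0], [1.0, 0.0]]),
}
question = combine_resolutions([[0.9, 0.1], [0.1, 0.9]])

top2 = retrieve(question, docs, k=2)
hit = recall_at_k(top2, relevant_id="doc_a")
```

In this toy setup `doc_c` matches the question at only one resolution and `doc_a` at both, so `doc_a` is ranked first. A real evaluation would average `recall_at_k` over all questions in the dataset.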
Related papers
- Text Embeddings For Retrieval From A Large Knowledge Base (2018) 4.52
- Multi-view Document Representation Learning For Open-domain Dense Retrieval (2022) 10.21
- MRNN: A Multi-resolution Neural Network With Duplex Attention For Document Retrieval In The Context Of Question Answering (2019) 0.00
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024) 0.00
- QAEA-DR: A Unified Text Augmentation Framework For Dense Retrieval (2024) 5.24
- Multi-modal Retrieval Of Tables And Texts Using Tri-encoder Models (2021) 6.34
- Improving Document Representations By Generating Pseudo Query Embeddings For Dense Retrieval (2021) 9.41
- Utilizing Embeddings For Ad-hoc Retrieval By Document-to-document Similarity (2017) 0.00