DeepER -- Deep Entity Resolution
2017 · Muhammad Ebraheem, Saravanan Thirumuruganathan, Shafiq Joty, et al.
Abstract
Entity resolution (ER) is a key data integration problem. Despite 70+ years of effort on all aspects of ER, there is still a high demand for democratizing ER: humans are heavily involved in labeling data, performing feature engineering, tuning parameters, and defining blocking functions. Building on recent advances in deep learning, in particular distributed representations of words (a.k.a. word embeddings), we present a novel ER system, called DeepER, that achieves good accuracy, high efficiency, and ease of use (i.e., much less human effort). For accuracy, we use sophisticated composition methods, namely uni- and bi-directional recurrent neural networks (RNNs) with long short-term memory (LSTM) hidden units, to convert each tuple to a distributed representation (i.e., a vector), which can in turn be used to effectively capture similarities between tuples. We consider both the case where pre-trained word embeddings are available as well as the case where they are not; we presen
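The idea of converting each tuple to a vector and comparing vectors can be illustrated with a minimal sketch. It is not the DeepER implementation: it swaps the RNN/LSTM composition described in the abstract for plain averaging of word vectors, and the embedding table, the records, and the helper names `tuple_vector` and `cosine` are all made up for illustration.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pre-trained embeddings instead.
TOY_EMBEDDINGS = {
    "apple": [0.9, 0.1, 0.0],
    "iphone": [0.8, 0.2, 0.1],
    "7": [0.1, 0.9, 0.0],
    "seven": [0.2, 0.8, 0.1],
    "samsung": [0.0, 0.2, 0.9],
    "galaxy": [0.1, 0.1, 0.8],
}


def tuple_vector(record, embeddings=TOY_EMBEDDINGS):
    """Average the vectors of all known tokens across a record's values.

    Averaging is a simple stand-in for the LSTM-based composition
    mentioned in the abstract (assumption made for this sketch).
    """
    tokens = [tok for value in record.values()
              for tok in str(value).lower().split()]
    known = [embeddings[tok] for tok in tokens if tok in embeddings]
    dim = len(next(iter(embeddings.values())))
    if not known:
        return [0.0] * dim  # no known token: fall back to the zero vector
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical records: a and b describe the same product, c a different one.
a = {"brand": "Apple", "model": "iPhone 7"}
b = {"brand": "apple", "model": "iphone seven"}
c = {"brand": "Samsung", "model": "Galaxy 7"}

sim_ab = cosine(tuple_vector(a), tuple_vector(b))
sim_ac = cosine(tuple_vector(a), tuple_vector(c))
print(f"sim(a, b) = {sim_ab:.3f}, sim(a, c) = {sim_ac:.3f}")
```

With these toy vectors the two records for the same product score higher than the mismatched pair, even though "7" and "seven" share no characters. That is the property a string-similarity feature would miss and that an embedding-based tuple representation is meant to capture.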
Related papers
- Explore Entity Embedding Effectiveness In Entity Retrieval (2019)
- Dyvo: Dynamic Vocabularies For Learned Sparse Retrieval With Entities (2024)
- Enterpriseem: Fine-tuned Embeddings For Enterprise Semantic Search (2024)
- QDER: Query-specific Document And Entity Representations For Multi-vector Document Re-ranking (2025)
- Generating Explanations To Understand And Repair Embedding-based Entity Alignment (2023)
- How To Reduce The Search Space Of Entity Resolution: With Blocking Or Nearest Neighbor Search? (2022)
- Dense Retrievers Can Fail On Simple Queries: Revealing The Granularity Dilemma Of Embeddings (2025)
- Retrieving Multi-entity Associations: An Evaluation Of Combination Modes For Word Embeddings (2019)