StruBERT: Structure-aware BERT For Table Search And Matching
2022 · Mohamed Trabelsi, Zhiyu Chen, Shuo Zhang, et al.
Abstract
A large amount of information is stored in data tables. Users can search for data tables using a keyword-based query. A table is composed primarily of data values that are organized in rows and columns providing implicit structural information. A table is usually accompanied by secondary information such as the caption, page title, etc., that form the textual information. Understanding the connection between the textual and structural information is an important yet neglected aspect in table retrieval as previous methods treat each source of information independently. In addition, users can search for data tables that are similar to an existing table, and this setting can be seen as a content-based table retrieval. In this paper, we propose StruBERT, a structure-aware BERT model that fuses the textual and structural information of a data table to produce context-aware representations for both textual and tabular content of a data table. StruBERT features are integrated in a new end-to-
Related papers
- TwinBERT: Distilling Knowledge To Twin-structured BERT Models For Efficient Retrieval (2020) · 0.00
- ColBERT: Efficient And Effective Passage Search Via Contextualized Late Interaction Over BERT (2020) · 0.00
- Diagnosing BERT With Retrieval Heuristics (2022) · 10.21
- CGPT: Cluster-guided Partial Tables With LLM-generated Supervision For Table Retrieval (2026) · 1.57
- ModelTables: A Corpus Of Tables About Models (2025) · 2.35
- Table2Vec: Neural Word And Entity Embeddings For Table Population And Retrieval (2019) · 13.55
- Multi-modal Retrieval Of Tables And Texts Using Tri-encoder Models (2021) · 6.34
- STaRK: Benchmarking LLM Retrieval On Textual And Relational Knowledge Bases (2024) · 5.04