
Discovering Entity Columns of Web Tables Effectively and Efficiently

Abstract


Unlike traditional relational tables, web tables have no designated key attributes or entity columns, which makes them difficult for machines to understand. The effectiveness of existing methods for entity column detection usually depends on the coverage of the knowledge base, and traversing the knowledge base is inefficient. In this paper, we propose a novel framework for discovering entity columns in web tables based on approximate primary functional dependencies. We build a table schema dependency graph that reflects the semantic dependency relationships between the columns of a web table. By iteratively calculating the importance of each attribute node in this graph with LeaderRank, our method detects entity columns accurately and efficiently in both single-entity and multi-entity tables. Experimental results on real web datasets show that our method significantly outperforms previous work in both effectiveness and efficiency, especially for large tables.
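
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dependency measure (the fraction of rows kept when each left-hand value is mapped to its most frequent right-hand value), the 0.9 threshold, the edge direction (a dependent column points to the column that determines it), and all function names are assumptions made for illustration. Only the LeaderRank step follows the standard published procedure of adding a ground node linked bidirectionally to every node.

```python
from collections import Counter, defaultdict


def afd_strength(rows, a, b):
    """Approximate strength of the dependency column a -> column b.

    Assumed measure: the fraction of rows that remain consistent when each
    value of column a is mapped to its most frequent value in column b.
    """
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[a]][row[b]] += 1
    kept = sum(max(counts.values()) for counts in groups.values())
    return kept / len(rows)


def dependency_graph(rows, n_cols, threshold=0.9):
    """Hypothetical schema graph: edges[b] holds every a with a -> b (b votes for a)."""
    edges = {col: set() for col in range(n_cols)}
    for a in range(n_cols):
        for b in range(n_cols):
            if a != b and afd_strength(rows, a, b) >= threshold:
                edges[b].add(a)
    return edges


def leader_rank(edges, tol=1e-9, max_iter=1000):
    """Standard LeaderRank: random walk on the graph plus a ground node."""
    nodes = list(edges)
    ground = object()  # ground node, linked to and from every real node
    out = {n: set(edges[n]) | {ground} for n in nodes}
    out[ground] = set(nodes)
    score = {n: 1.0 for n in nodes}
    score[ground] = 0.0
    for _ in range(max_iter):  # max_iter bounds the loop if the walk oscillates
        nxt = dict.fromkeys(out, 0.0)
        for src, targets in out.items():
            share = score[src] / len(targets)
            for target in targets:
                nxt[target] += share
        delta = max(abs(nxt[n] - score[n]) for n in out)
        score = nxt
        if delta < tol:
            break
    bonus = score[ground] / len(nodes)  # redistribute the ground node's score
    return {n: score[n] + bonus for n in nodes}


# Toy table (invented data): country, continent, currency.
rows = [
    ("France", "Europe", "Euro"),
    ("Germany", "Europe", "Euro"),
    ("Japan", "Asia", "Yen"),
    ("China", "Asia", "Yuan"),
    ("Spain", "Europe", "Euro"),
    ("India", "Asia", "Rupee"),
]
edges = dependency_graph(rows, n_cols=3)
scores = leader_rank(edges)
ranking = sorted(scores, key=scores.get, reverse=True)
```

On this toy table the country column (index 0) determines both other columns and therefore ranks first. A real table would need additional handling, for example for numeric columns whose values happen to be unique, which this sketch does not attempt.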
