Internal Data Reliability Assessment of Unified Heterogeneous Data

Abstract


Data cleaning and data integration are conventional steps of the data preprocessing stage; differing schemas, missing values, and recording errors are the major problems to deal with at this stage. The cleaned data can then be used for further analysis with data mining techniques. However, data veracity, or data reliability, may not have been taken into consideration in previous studies. This may result in incorrect query results and faulty conclusions. This paper proposes a data reliability assessment for unified data from heterogeneous databases. Characteristics of heterogeneous data, such as the degree of record overlap and differences in values, are also investigated. The proposed assessment can be applied to achieve completeness, robustness, and conciseness of the integrated data.
