透過您的圖書館登入
IP:3.147.73.35
  • 學位論文

球型索引:一個在高維度不確定性資料下的有效查詢索引結構

Sphere Index: An Effective Query Process For High Dimensional Uncertain Data

指導教授 : 劉傳銘
若您是本文的作者,可授權文章由華藝線上圖書館中協助推廣。

摘要


不確定性資料有別於傳統的確定性資料,由於不確定性資料的不確定特性,造成了在資料管理以及運算上,增加了計算上的複雜度和龐大的計算量,因此對於不確定性資料的管理、分析和索引是個重要的研究議題。常用的索引空間資料的索引結構,如R-tree,亦可用來索引不確定性資料,但其效能會隨著資料量的成長而趨於劣化,且當資料維度增加時,索引的效能也令人無法滿意。在這篇論文中筆者提出了一個新的索引不確定性資料的架構—S-Index (Sphere Index),在進行資料查詢時擁有較佳的效率,另一方面在高維度的資料集上,亦能有好的表現。S-Index可支援多種查詢型態,如:單點查詢(Point Query)、機率性範圍查詢(Probabilistic Range Query)和機率性最近相鄰者查詢(Probabilistic Nearest Neighbor Query)。最後,實驗使用模擬的不確定性資料集來進行,以記憶體存取數和CUP花費時間的角度,來觀察、分析、與驗證S-Index索引結構之效率,其結果合乎預期目的。

並列摘要


The uncertain datum is more difficult than certain datum in process. It need more huge calculate and more complex compute. Because the relation of uncertain in uncertain datum. The uncertain may be an estimates, statistics or probability value. And there have several importance issues of analysis, management and index in uncertain data. The R-Trees[6] is general index used in spate data. Also the R-Trees could apply in uncertain data. But, it has a fatal defect in nature. The index efficiency will degraded when data grow up. It is poor performance for each query, because the index will index more invalid space in internal node. And spend more unnecessary memory accesses and CUP times. Thus, here propose a new index structure Sphere-Index in this paper. It is shortest call for S-Index. There were not exit overlay between index MBRs of each internal in S-Index. So it will be improve and effect for query processing. In the other hand, S-Index has better performance than R-Trees in high dimensional data. S-Index supply more type of query processing (pint query, probabilistic range query and probabilistic Nearest Neighbor query). The other issue of update will be discussed in the paper, too. The simulation experiment is the end of this paper. Experiment is use simulation data set, and use the evaluate criterion by memory accesses and CUP times. The Experiment will be prove and verify the effective of S-Index.

參考文獻


[16] Yasushi Sakurai, Masatoshi Yoshikawa, Shunsuke Uemura and Haruhiko Kojima, " The A-tree: An Index Structure for High-Dimensional Spaces Using Relative Ap-proximation", Proceedings of the 26th International Conference on Very Large Data Bases, pp.516-526, 2000.
[17] Ying Zhang, Wenjie Zhang, Qianlu Lin and Xuemin Lin, "Effectively Indexing the Multi-Dimensional Uncertain Objects for Range Searching", Proceedings of the 15th International Conference on Extending Database Technology, pp.504-515, 2012.
[11] Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel, "The X-tree: An Index Structure for High-Dimensional Data", Proceedings of the 22th International Con-ference on Very Large Data Bases, pp.28-39, 1996.
[7] G′sli R. Hjaltason and Hanan Samet, "Distance Browsing in Spatial Databases", Proceedings of the ACM Transactions on Database Systems, 24(2): pp. 265–318, 1999.
[15] Volker Gaede and Oliver Gunther, "Multidimensional Access Methods", Proceed-ings of the ACM Computing Surveys, 30(2): pp.170-231, 1998.

延伸閱讀