Designing a New Index Method to Improve the Query Efficiency of Probabilistic Skylines

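Skyline computation has been widely applied in data analysis in recent years. However, as data volumes grow, as more factors must be taken into account, and as data types change, the computation becomes considerably more difficult. As a result, more and more researchers have designed better algorithms for skyline computation in order to achieve better performance. In past work on skyline computation, whether on certain or uncertain data, the most commonly used index has been the R-Tree. Because R-Tree nodes often overlap, pruning tends to be inefficient, so the Z-Tree was proposed and successfully mitigated this problem. The new index proposed in this thesis is similar to the Z-Tree; it mainly changes the order in which nodes are placed in the index, which reduces the number of comparisons between nodes. The final experiments show that the performance of our method improves markedly as the data size grows or the dimensionality increases.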

Keywords

Index; Skyline; Probabilistic Skyline; Uncertain Data

Parallel Abstract

Skyline computation has been widely used in data analysis in recent years, but it has become considerably more difficult as the amount of data grows, the factors to be considered become more varied, and the types of data change, so more and more researchers are designing better algorithms to achieve better performance. The R-Tree has been the most frequently used index for skyline computation on both certain and uncertain data, but overlap between its nodes often makes pruning inefficient, so the Z-Tree was proposed to mitigate this problem. This paper proposes a new index that is similar to the Z-Tree; it mainly changes the order in which nodes are placed in memory, which reduces the number of dominance tests between nodes in high dimensions. The final experiments show that the performance of the proposed method improves significantly as the amount of data or the number of dimensions increases.

Parallel Keywords

Index; Skyline; Probabilistic Skylines; Uncertain Data
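
To make the terms "dominance test" and "Z-order" in the abstract concrete, the following is a minimal sketch on certain (not probabilistic) data. It is not the index proposed in this thesis. The function names `dominates`, `z_address`, and `skyline_z_order` are hypothetical, and the sketch assumes non-negative integer coordinates smaller than `2**bits`, with smaller values being better in every dimension. It relies on a general property of Z-order (Morton) addresses: if one point dominates another, its Z-address is strictly smaller, so a scan in Z-order only needs to test each point against the skyline found so far.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def dominates(p: Sequence[int], q: Sequence[int]) -> bool:
    """Return True if p dominates q: p is no worse in every dimension
    and strictly better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def z_address(p: Sequence[int], bits: int = 8) -> int:
    """Interleave the low `bits` bits of each coordinate into one integer
    (a Morton code). Assumes 0 <= coordinate < 2**bits."""
    z = 0
    for level in range(bits - 1, -1, -1):  # most significant bit first
        for coord in p:
            z = (z << 1) | ((coord >> level) & 1)
    return z


def skyline_z_order(points: List[Point], bits: int = 8) -> List[Point]:
    """Compute the skyline by scanning points in ascending Z-address order.

    A dominating point always has a smaller Z-address than the point it
    dominates, so a point already placed in the skyline is never removed later.
    """
    skyline: List[Point] = []
    for p in sorted(points, key=lambda pt: z_address(pt, bits)):
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline
```

For example, `skyline_z_order([(1, 9), (3, 3), (4, 4), (9, 1), (5, 5), (2, 8)])` keeps `(1, 9)`, `(2, 8)`, `(3, 3)`, and `(9, 1)`, and discards `(4, 4)` and `(5, 5)` because `(3, 3)` dominates both. The number of calls to `dominates` is the cost that the abstract refers to when it speaks of reducing dominance tests.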

