透過您的圖書館登入
IP:18.219.67.169
  • 學位論文

版本文件中對時間查詢最佳化之索引架構的更省空間實作

More Space Efficient and Practical Framework for Time Travel Phrase Queries on Versioned Documents

指導教授 : 韓永楷

摘要


目前版本文件的成長是越來越迅速的。因此回答time-travel phrase的查詢的需求是越來越必要的。目前有很多方法可以提高回答time-travel phrase的查詢的性能。 在本篇論文中,我們參考了"Space-Efficient Index Framework for Time-Travel Phrase Query on Versioned Document"的實作。然而我們發現此實作使用了太多的索引空間,因此我們希望提供出新的實作以降低其對所以空間的需求。為了達成這個目的,我們實作出了兩個新方法:Impl1: compressed suffix tree + 將每個版本視為獨立的檔案 和 Impl2: compressed suffix tree + K^2-Treaps。這兩個方法與郭的實作相比,Impl1減少了63%的索引空間,而Impl2更是減少了80%的索引空間。且這兩種實作也與郭的實作相同,皆支援了time-travel phrase以及Top-k的查詢。

並列摘要


Version documents are growing faster nowadays. Answering time-travel phrase queries is getting more and more necessary. There are lots of methods to improve the performance of answering time-travel phrase queries. In this paper, we reference the implementation of "Space-Efficient Index Framework for Time-Travel Phrase Query on Versioned Document". However, we find that Kuo's implementation use lots of index space, so we want to reduce the index space. In order to achieve this purpose, we have two implementations, Impl1: compressed suffix tree + treating versions as different documents and Impl2: compressed suffix tree + K^2-Treaps. Comparing with the total index of Kuo's implementation, Imple1 reduced by 63% and Impl2 reduced by 80%, and these two implementations also support time-travel phrase queries and Top-k queries.

參考文獻


[7] Shu Yao Chien, Vassilis J Tsotras, Carlo Zaniolo, and Donghui Zhang. Stor- ing and querying multiversion xml documents using durable node numbers. In Web Information Systems Engineering, 2001. Proceedings of the Second International Conference on, volume 1, pages 232–241. IEEE, 2001.
[9] Johannes Fischer, Veli M ̈akinen, and Gonzalo Navarro. Faster entropy-bounded compressed suffix trees. Theoretical Computer Science, 410(51):5354–5364, 2009.
[10] Roberto Grossi and Jeffrey Scott Vitter. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM Journal on Computing, 35(2):378–407, 2005.
[11] Antonin Guttman. R-trees: A dynamic index structure for spatial searching, volume 14. ACM, 1984.
[14] Wing-Kai Hon, Rahul Shah, and Jeffrey Scott Vitter. Space-efficient frame- work for top-k string retrieval problems. In Foundations of Computer Science, 2009. FOCS’09. 50th Annual IEEE Symposium on, pages 713–722. IEEE, 2009.

延伸閱讀