An Efficient Algorithm for Answering Top-k Typical Range Representatives Query

當我們解釋一個新觀念給我們的同事時,最正式的方法就是給它的定義和描述它的特徵。然而,當我們的聽眾是一個小孩時,我們通常會避免用正式的方法因為它難以被瞭解。在這個實例上的一個更好的方法就是先給一個這個觀念的「典型的例子」。一個相似的實例出現在大型資料庫的搜尋中,例如像是我們想要用一些關鍵字在 Google 中做搜尋。當我們搜尋資料庫且取得一組結果時,一般來說我們渴望的是在那些回傳的結果中的一些「典型」的例子。我們為了檢索最典型的資料而定義典型範圍代表者詢問並且設計對此詢問之有效率的演算法。在 RAM 模型之下,我們的演算法會在 O(n log n) 時間內回答此詢問,而且我們猜想這個範圍是最理想的。而在 external-memory 模型之下,我們提出另一個會回答詢問用 O(SORT(n)) I/O 數的演算法,而 SORT(n) 是在此模型下排序 n 筆資料以 I/O 數來計算的下界。

關鍵字

典型；代表者；典型範圍代表者

並列摘要

When we explain a new concept to our colleagues, the most formal way is to give its definition and to describe its attributes. However, if our listener is a child, we will normally avoid the formal way as it is difficult to be understood. A better approach in this case is to first give a “typical example” of the concept. A similar case occurs in searching large databases. For instance when we want to search Google by some keywords, As we search the database and receive a set of result, what we desire in general is some “typical” examples from those reported results. We define the top-k typical range representatives query for retrieving the top-most typical data and design the efficient algorithms for the query. In the RAM model, our algorithm will answer the query in O(n log n) time, where we conjecture the bound is optimal. And in the external-memory model, we propose another algorithm which will answer the query in O(SORT(n)) I/O’s, where SORT(n) is the lower bound on the number of I/O’s to sort n data in the model.

並列關鍵字

Typicality ； Representative ； Typical Range Representatives

參考文獻

[2] Y. Han (2002). Deterministic sorting in O(n log log n) time and linear space. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pages 602–608.

[4] J. S. Vitter (2008). Algorithms and Data Structures for External Memory. Series on Foundations and Trends in Theoretical Computer Science, volume 2, number 4, pages 305–474.

[1] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein (2001). Introduction to Algorithms, MIT Press.

Google Scholar

[3] M. Hua, J. Pei, A. W. Fu, X. Lin, and H. Leung (2007). Efficiently answering top-k typicality queries on large databases. In Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB), pages 890–901.

Google Scholar

被引用紀錄

杜佳玲（2004）。疼痛教育及放鬆訓練對改善癌痛之立即與中程效果比較〔碩士論文，臺北醫學大學〕。華藝線上圖書館。https://www.airitilibrary.com/Article/Detail?DocID=U0007-1704200714570930

國際替代計量

An Efficient Algorithm for Answering Top-k Typical Range Representatives Query

全文下載

主題瀏覽