透過您的圖書館登入
IP:3.148.107.255
  • 學位論文

Improved Algorithms for Some Pattern Matching and Vertex-Disjoint Paths Problems

針對字串索引及點互斥路徑問題之改進演算法

指導教授 : 王炳豐

摘要


This dissertation is composed of the following two fundamental research topics in computer science: string matching and finding disjoint paths. In the first part, we study the following three indexing problems: the positional text indexing problem, the position-restricted text indexing problem, and the variable- length don't care text indexing problem. Previous solutions for these indexing problems heavily rely on efficient indexes to the range successor problem. Thus, any improvement on the range successor problem immediately leads to improved results for these indexing problems. In this dissertation, we present three new indexes for the range successor problem. More specifically, we give (1) a succinct n + o(n)-word index with O(log n / loglog n) query time, whose query time improves previous O(n)-word index by an O(loglog n) factor; (2) an O(n loglog n)-word index with O((loglog n)^2) query time, whose space-time product is better than all previous solutions; and (3) an O(n^{1+epsilon})-word index with O(1) query time, which is simpler than the previous O(1)-time index. In addition, our index in (2) can be extended to solve the well-known orthogonal range successor problem in R^3. The extended index needs O(n log^{1+epsilon} n) words and supports O(log n loglog n) query time, improving a long-standing result which uses O(n log^2 n) words with the same query time. Our results on the range successor problem immediately lead to improved results for the above three indexing problems over general alphabets. For real world applications, the alphabet size is usually small. In this dissertation, we also consider these three indexing problems over alphabets of size O(polylog(n)). For the first and third problems, we present optimal O(n)-word indexes with O(p) query time. For the second problem, we show that a query can be answered in O(n log^epsilon n) space and O(p + occ) time, or in O(n) space and O(p + occ log^epsilon n) time. When |Σ| = O(polylog(n)), our indexes are better than all the previous results. In the second part of this dissertation, we consider the following two problems: the problem of finding length-bounded two vertex-disjoint paths in an undirected planar graph, and the problem of finding minmax k vertex-disjoint paths in a directed acyclic graph. For the first problem, an improved algorithm is presented, which requires O(n^3 b_min) time and O(n^2 b_min) space, where b_min is the smaller of the two given length bounds. For the second problem, an improved algorithm and a faster fully polynomial-time approximation scheme are proposed. The proposed algorithm requires O(n^{k+1} M^{k-1}) time and O(n^k M^{k-1}) space, and the proposed approximation scheme requires O((1/epsilon)^{k-1} n^{2k} log^{k-1} M) time and O((1/epsilon)^{k-1} n^{2k-1} log^{k-1} M) space, where epsilon is the given approximation parameter and M is the length of the longest path in an optimal solution.

並列摘要


本篇論文探討了資訊科學領域中相當基礎且重要的兩個研究議題: 字串索引問題及找尋互斥路徑問題。 本論文的第一部分探討下列三個有位置限制的字串索引問題: 特定位置之後的字串索引問題、區間範圍內的字串索引問題、以及包含 variable-length don't care 符號的字串索引問題。過去的結果主要是依賴解決 the range successor problem 的資料結構作為工具,所以只要 range successor 這個問題的資料結構有任何的改進,就可以得到這三個問題的改進結果。在這篇論文中,我們首先針對 range successor 這個問題提出三個新的索引資料結構。更明確地說,我們提出了 (1) 一個使用 n + o(n) 空間且支援 O(log n / loglog n) 查詢時間的資料結構,此結果的查詢時間比先前使用相同空間的資料結構快 O(loglog n) 倍;(2) 一個使用 O(n loglog n) 空間且支援 O((loglog n)^2) 查詢時間的資料結構,此結果的空間時間乘積優於過去所有的結果;以及 (3) 一個使用 O(n^{1+epsilon}) 空間且支援 O(1) 查詢時間的資料結構,此結果比先前有相同複雜度的資料結構簡單許多。此外,第二個資料結構也可以用來解決在 R^3 空間中一個稱為 orthogonal range successor problem 的重要問題,使用 O(n log^{1+epsilon} n) 空間且支援 O(log n loglog n) 查詢時間;這個結果改進了過去已經存在很久的最好結果。 利用我們在 range successor 這個問題上的改進結果,上述三個有位置限制的字串索引問題在任意大小的字符集下都可以被立刻改進。在現實生活的應用中,字符集通常很小,因此在本論文中,我們也探討了在小字符集下的上述三個字串索引問題。當字符集大小是 O(polylog(n)) 時,我們為這些問題提出了更有效率的改進演算法。針對第一個與第三個問題,我們提出了使用 O(n) 空間且支援 O(p) 查詢時間的最佳資料結構。針對第二個問題,我們提出了一個使用 O(n log^epsilon n) 空間且支援 O(p) 查詢時間的資料結構;與一個使用 O(n) 空間且支援 O(p + occ log^epsilon n) 查詢時間的資料結構。當字符集大小是 O(polylog(n)) 時,我們的演算法改進了過去所有的結果。 這篇論文的第二部分探討以下兩個問題: 在一個無向平面圖中找尋兩條有長度限制的點互斥路徑問題、以及在一個有向無循環圖中找尋 k 條點互斥路徑且讓最長路徑最小化的問題。針對第一個問題,我們提出一個改進演算法,時間與空間的複雜度分別為 O(n^3 b_min) 與 O(n^2 b_min),其中 b_min 為兩個給定的長度限制中較小的一個。針對第二個問題,我們提出了一個改進的演算法與一個更快的完全多項式時間近似方案 (fully polynomial-time approximation scheme)。提出的改進演算法使用了 O(n^{k+1} M^{k-1}) 時間與 O(n^k M^{k-1}) 空間;而提出的近似方案使用了 O((1/epsilon)^{k-1} n^{2k} log^{k-1} M) 時間與 O((1/epsilon)^{k-1} n^{2k-1} log^{k-1} M) 空間,其中 epsilon 為給定的近似參數、M 為最佳解中最長路徑的長度。

參考文獻


[2] T. Akutsu, "Approximate string matching with variable-length don't care characters," IEICE Transactions on Information and Systems, vol. E79-D, no. 9, pp. 1353-1354, 1996.
[3] S. Alstrup, G. S. Brodal, and T. Rauhe, "New data structures for orthogonal range searching," in Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000, pp. 198-207.
[4] M. A. Bender and M. Farach-Colton, "The LCA problem revisited," in Proceedings of the 4th Latin American Theoretical Informatics Symposium, 2000, pp. 88-94.
[6] J. L. Bentley, "Multidimensional divide-and-conquer," Communications of the ACM, vol. 23, no. 4, pp. 214-229, 1980.
[7] A. A. Bertossi and E. Lodi, "Parallel string matching with variable length don't cares," Journal of Parallel and Distributed Computing, vol. 22, no. 2, pp. 229-234, 1994.

延伸閱讀