透過您的圖書館登入
IP:3.144.113.197
  • 學位論文

最重子序列的快速演算法

Efficient Algorithms for some Heaviest Subsequence Problems

指導教授 : 趙坤茂 老師

摘要


最長共同子序列(LCS)問題以及最長遞增子序列(LIS)問題是兩個很古典的 問題,許多有趣的問題都從這兩個問題延伸。在計算生物學上,許多的研究顯示 出在兩個或多個基因組序列上存在一些特定的關係。在最長遞增共同子序列 (LCIS)問題上,他們想要在所有共同遞增子序列中,找出最長的那一段。在 一套完全基因組序列比對的工具MUMmer 中,最長共同遞增子序列問題可以用 來找出最有可能性的最大唯一配對子字串 (MUM) 群,並且在相同的順序。在 MUMmer 中每一個MUM 給予相同的權重,我們提出一個較一般性的表達,使 得這些MUMs 有不同權重。 第一個部份我們介紹最重遞增子序列(HIS)問題。給予一個長m的序列以 及一個權重函數,我們想要找出遞增子序列而且擁有最重的權重總和,我們提出 了一個演算法可以在O( mloglogm+Sort(m))的時間以及O(m)空間計算出來。 第二個部份我們介紹最重共同子序列(HCS)問題。有兩條序列分別長m 及n 且m>n,而在這兩條序列當中,如果有相同元素配對發生,則在他們之間 存在一個權重,在此我們將描述一個由Jacobson 提出的演算法。 第三個部份我們提出了一個新問題叫做最重共同遞增子序列(HCIS)問題, 這問題是最長共同子序列問題的延伸。我們先提出一個動態配置演算法可以在 O(mnlogn)的時間以及O(mn)的空間找出最重共同遞增子序列。之後我們更提出 了一個演算法可以在O(rnloglogn)的時間以及O(mn)的空間,其中r 是兩序列間 的配對數。 最後,我們不允許元素間重疊,並創造了新問題:保存最重子序列(CHS) 問題及保存最重遞增子序列(CHIS)問題,它們均可在O(mlogm)的最佳時間解 決。

並列摘要


We study some variants of the longest common subsequence (LCS) and the longest increasing subsequence (LIS) problems. In computational biology, some researches have shown that there are some relationships between two common genes from two or more different genomes. In the longest common increasing subsequence (LCIS) problem, they want to find one of the common increasing subsequences with maximum length between two different sequences. The LCIS problem is to find the longest maximal unique matching (MUM) substrings in the same order among three or more genomes in a whole genome alignment tool MUMmer and each MUM has weight one. We give a more general formulation for this problem such that each MUM has a different weight. In the first part, we study the heaviest increasing subsequence (HIS) problem, and propose an O(mlog logm+ Sort(m)) time and O(m) space algorithm for it, where m is the length of the input sequence and Sort(m) denotes the time required to sort a sequence with length m. In the second part, we study the heaviest common subsequence (HCS) problem and describe the algorithm proposed by Jacobson et al [17]. In the third part, for two sequences of length m and n, we define a new problem called the heaviest common increasing subsequence (HCIS) problem as an extension from the LCIS problem. We present an algorithm for computing the HCIS in O(mn log n) time and O(mn) space. In addition, we propose another algorithm in O(rn log log n) time and O(mn) space, where r is the number of the common elements. Finally, we add a new conserved variation, and define the conserved heaviest subsequence (CHS) and conserved heaviest increasing subsequence (CHIS) problem, which can be solved in O(mlogm) time.

並列關鍵字

HCS HIS HCIS CHS

參考文獻


[1] D. Aldous and P. Diaconis. Longest increasing subsequence: From patience sort to
Society, 36(4):413-432, 1999.
not always difficult. 5th International Workshop on Algorithms in Bioinformatics
intervals of K permutations, with applications to modular decomposition of graphs.
[4] L. Bergroth, H. Hakonen, and T. Raita. A survey of longest common subsequence

延伸閱讀