一個計算非重疊反轉與轉位距離的有效率演算法

從演化的觀點來看，DNA序列可能會藉由大規模突變(又稱為重組)演化，例如反轉 (用反向互補序列取代DNA序列的某一片斷)與轉位(把DNA序列的某一片段從原來的位置移到另一個位置，或等同於把兩個相鄰但不重疊的DNA序列片斷做互換) 。給予兩個長度皆為n的字串，他們之間的非重疊反轉與轉位距離(又稱為突變距離)被定義為使用最少次數的非重疊反轉與轉位把一個字串轉換成另一個字串。在本論文的研究中，我們提出一個O(n^3)時間與O(n^2)空間的演算法來計算出兩個輸入字串之間的突變距離。事實上，我們的演算法可以被用來設計出一個O(nm^3)時間與O(m^2)空間的演算法去解決非重疊反轉與轉位距離的近似字串比對問題，此問題的目標是要去找出在一個給定的內文中所有子字串，使得這些子字串與另一個給定的樣板之間的突變距離都小於或等於一個給定的門檻值k，其中n為內文長度， m為樣板長度。

關鍵字

演算法；計算生物學；反轉；轉位；突變距離

並列摘要

From evolutionary point of view, DNA sequences may evolve by large-scale mutations (also called rearrangements), such as inversions (i.e., replacing a fragment of DNA sequence by its reverse complement) and transpositions (i.e., moving a fragment of DNA sequence from one location to another or, equivalently, exchanging two adjacent and non-overlapping fragments on DNA sequence). Given two strings of the same length n, the non-overlapping inversion and transposition distance (also called mutation distance) between them is defined as the minimum number of non-overlapping inversion and transposition operations used to transform one string into the other. In this study, we present an O(n^3) time and O(n^2) space algorithm to compute the mutation distance of two input strings. In fact, our algorithm can be used to design an algorithm that can solve the approximate string matching problem under non-overlapping inversion and transposition distance, which is to find all substrings of a given text whose mutation distances from a given pattern are less than or equal to a given threshold k, in O(nm^3) time and O(m^2) space, where n is the length of the text and m is the length of the pattern.

並列關鍵字

Algorithm ； computational biology ； inversion ； transposition ； mutation distance

參考文獻

[1]A. Amir, T. Hartman, O. Kapah, A. Levy, E. Porat: On the cost of interchange rear-rangement in strings. SIAM Journal on Computing, 39 (2009) 1444-1461.

[2]D. Cantone, S. Cristofaro, S. Faro: Efficient string-matching allowing for non-overlapping inversions. Theoretical Computer Science, 483 (2013) 85-95

[3]D. Cantone, S. Faro, E. Giaquinta: Approximate string matching allowing for inversions and translocations. Proceedings of the Prague Stringology Conference 2010, pp. 37-51.

[5]D.-J. Cho, Y.-S. Han, H. Kim: Alignment with non-overlapping inversions and transloca-tions on two strings. Theoretical Computer Science, 575 (2015) 90-101

[6]S. Grabowski, S. Faro, E. Giaquinta: String matching with inversions and translocations in linear average time (most of the time). Information Processing Letters, 111 (2011) 516-520.

國際替代計量

一個計算非重疊反轉與轉位距離的有效率演算法

主題瀏覽