透過您的圖書館登入
IP:3.144.43.165
  • 學位論文

離散旋積方法解決精確字串比對問題

The Discrete Convolution Method for Solving the Exact String Matching Problem

指導教授 : 吳誠文 李家同
若您是本文的作者,可授權文章由華藝線上圖書館中協助推廣。

摘要


在解決精確字串比對問題的許多演算法中,離散旋積方法是一個簡單的方法。在這篇論文中,首先分析一個機率問題:一個樣本字串出現在另一個較長的本文字串中的機率?在論文中假設所有的樣本字串與本文字串都是隨機產生的,依照這個假設,我們得到了一個近似的公式,可以知道當樣本字串的長度增加時,樣本字串在本文字串中出現的機率將快速地下降至0。根據這個結果,論文中提出一個提前終止離散旋積程序的演算法。在離散旋積程序進行的過程中,當我們發現樣本字串的一個字首沒有出現在本文字串中時,可以提前終止離散旋積程序。依照實驗的結果,提前終止的離散旋積演算法,可以有效率的解決精確字串比對問題。除此之外,我們比較了離散旋積演算法,以及同樣用來解決精確字串比對問題的位移加法演算法。根據論文中的討論,我們知道離散旋積與位移加法演算法的概念是相同的。

並列摘要


In this thesis, we introduce discrete convolution method on solving the exact string matching problem. Based on the assumption that all the text and pattern strings are generated randomly, we derived an equation which can approximate the probability of appearing of a pattern string in a text string. From this equation, we see that the probability that a pattern string appears in a text string reduces to 0 quickly as the length of the pattern string increases. Because of this observation, we introduce an algorithm based on the discrete convolution method with early termination. The algorithm terminates as soon as it discovers that a prefix of the pattern string does not appear in the text string. We show that the discrete convolution method with early termination is quite efficient to solve exact string matching problem for randomly generated text and pattern strings. In this thesis, we also show that the shift-add algorithm is equivalent to and can be implemented by the discrete convolution.

參考文獻


[1] Knuth, D.E., Morris, J.H., Pratt V.R. Fast pattern matching in strings, SIAM Journal on Computing, 6(1), 1977, pp.323-350.
[3] Boyer, R.S. and Moore, J.S. A fast string searching algorithm, Communications of the ACM 20, 10, 1977, pp.762-772.
[4] Horspool, R.N. Practical fast searching in strings Software Practice and Experience, Vol.10, No. 6. (1980), pp. 501-506
[6] R. C. T. Lee. String matching, 2011. (Unpublished)
[7] Baeza-Yates, R.A. and Gonnet, G.H. A new approach to text searching, Communications of the ACM, Vol.35(10), 1992, pp.74-82.

延伸閱讀