Simple Recurrent Network Analysis of the Hidden Structure of Symbol Sequences and Semantic Encoding

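A simple recurrent network can discover the hidden structure in sequential data. This thesis uses a simple recurrent network to process gene sequences, locating the boundaries of protein-coding regions at the positions where the network's prediction error is high. It also proposes a new algorithm that, while training the simple recurrent network, changes not only the weights but also the encoding of the inputs in order to reduce the prediction error. The encodings obtained in this way have the properties of a semantic encoding and are applied to semantic search, analysis of authors' writing styles, and word sense disambiguation.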

Keywords

simple recurrent network; gene sequence segmentation; semantic encoding; semantic search; word sense disambiguation

Parallel Abstract

The Elman network can discover the hidden structure of sequential data. This thesis uses an Elman network to process genome sequences and detects the boundaries of protein-coding regions from the prediction error. Moreover, for the analysis of literary works, it proposes a redesigned Elman network training algorithm that renews the distributed representation in each iteration. The resulting representation possesses form-based and function-based similarity to a certain degree and is used for semantic search, writing style analysis, and word sense disambiguation.
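The two ideas in the abstract, reading boundaries off the prediction error and adjusting the input representation during training, can be illustrated with a minimal sketch. This is not the thesis's implementation: the class `ElmanSketch`, the function `high_error_positions`, the layer sizes, the learning rates, the mean-plus-k-standard-deviations threshold, and the toy sequence are all assumptions made for illustration. The context layer is treated as a plain copy of the previous hidden state, so gradients are not propagated back through time.

```python
import numpy as np

SYMBOLS = "ACGT"  # assumed alphabet for a toy DNA sequence


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


class ElmanSketch:
    """Hypothetical minimal Elman (simple recurrent) network."""

    def __init__(self, n_symbols=4, code_dim=4, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # One input code per symbol; starts as one-hot and may be updated.
        self.codes = np.eye(n_symbols, code_dim)
        self.W_in = rng.normal(0.0, 0.5, (hidden, code_dim))
        self.W_ctx = rng.normal(0.0, 0.5, (hidden, hidden))
        self.W_out = rng.normal(0.0, 0.5, (n_symbols, hidden))

    def run(self, ids, lr=0.0, code_lr=0.0):
        """Pass over the sequence once; return the next-symbol error per step.

        With lr > 0 the weights are updated; with code_lr > 0 the input
        code of the current symbol is updated as well.
        """
        h = np.zeros(self.W_ctx.shape[0])  # context units start at zero
        errors = []
        for cur, nxt in zip(ids[:-1], ids[1:]):
            x = self.codes[cur].copy()
            h_new = np.tanh(self.W_in @ x + self.W_ctx @ h)
            p = softmax(self.W_out @ h_new)
            errors.append(-np.log(p[nxt]))  # cross-entropy of the true next symbol
            if lr or code_lr:
                d_out = p.copy()
                d_out[nxt] -= 1.0
                d_h = (self.W_out.T @ d_out) * (1.0 - h_new ** 2)
                d_x = self.W_in.T @ d_h  # gradient with respect to the input code
                self.W_out -= lr * np.outer(d_out, h_new)
                self.W_in -= lr * np.outer(d_h, x)
                self.W_ctx -= lr * np.outer(d_h, h)
                self.codes[cur] -= code_lr * d_x
            h = h_new  # copy hidden state into the context units
        return np.array(errors)


def high_error_positions(errors, k=2.0):
    """Positions whose error exceeds mean + k * std (an assumed criterion)."""
    threshold = errors.mean() + k * errors.std()
    # errors[i] belongs to the symbol at position i + 1
    return np.flatnonzero(errors > threshold) + 1


# Toy sequence with a change of pattern in the middle (illustrative only).
seq = "ACGT" * 20 + "TTGA" * 20
ids = [SYMBOLS.index(c) for c in seq]

net = ElmanSketch()
before = net.run(ids).mean()
for _ in range(50):
    net.run(ids, lr=0.1, code_lr=0.01)
errs = net.run(ids)
after = errs.mean()
candidates = high_error_positions(errs)
```

Calling `run` with both learning rates at zero only measures the error, which is how `before` and `errs` are obtained here. Positions returned by `high_error_positions` are candidate boundaries under the assumed threshold, not a result reported in the thesis.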

Parallel Keywords

Elman network; DNA segmentation; semantic encoding; writing style analysis; word sense disambiguation

