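Biologically, once a protein with a given function has been synthesized, it moves to a particular organelle to carry out that function, so a protein's function can be expected to correlate strongly with its location within the cell. Accurate prediction of a protein's subcellular location would therefore help infer its function in greater detail, and an automatic and reliable prediction method is much needed. This is especially true for the analysis of large volumes of genome sequences, a problem that cannot be solved by manual effort alone and must rely on computation to be handled quickly and effectively. Several methods have already been proposed for this problem, but their results are still unsatisfactory. In this paper, subcellular location is predicted using a Bayesian inference method together with the concept of information gain applied to decision rules, and nearest neighbor classification is then used to handle ambiguous data more effectively. This approach gives good and effective results.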
Biologically, the function of a protein is closely related to its subcellular localization. Accordingly, an automatic yet reliable method for predicting protein subcellular localization is needed, especially when large-scale genome sequences are to be analyzed. Various methods have been proposed for this task, but their results are not satisfactory in terms of effectiveness and efficiency. In this paper, a Bayesian inference method is proposed in which information gain is used to extract the important information; moreover, nearest neighbor classification is applied to handle ambiguous data. The approach is considerably effective for subcellular localization prediction in a supervised fashion.
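
As a rough illustration of the information gain concept named above, the following minimal sketch computes how much a single discrete feature reduces the entropy of the location labels. It is not the implementation used in this paper: the function names, the binary feature `has_signal_peptide`, and the toy labels are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(features, labels):
    """Entropy reduction in `labels` after splitting on a discrete feature."""
    total = len(labels)
    # Group the labels by the feature value observed for each sample.
    groups = {}
    for value, label in zip(features, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one binary feature per protein and its location label.
has_signal_peptide = [1, 1, 0, 0]
location = ["extracellular", "extracellular", "nuclear", "nuclear"]
gain = information_gain(has_signal_peptide, location)
```

In this toy case the feature separates the two locations perfectly, so the gain equals the full label entropy; a feature unrelated to the labels would give a gain close to zero. Features could be ranked by such a score before being passed to a classifier, although how the paper combines information gain with Bayesian inference is not specified in this abstract.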