
Assessment of the Optimum Classifier Based on Genetic Algorithm with Dynamic Parameters (GADP) for Microarray Data

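Abstract


In recent years, microarray technology has been widely applied in bioinformatics and clinical diagnosis because it can measure the expression of thousands of genes simultaneously. Most of the genes measured, however, are of little importance for clinical diagnosis. To address this problem, many novel methods have been proposed for selecting important genes. Among them, the genetic algorithm with dynamic parameters (GADP) is characterized by selecting the smallest number of important genes while achieving the highest classification efficiency. Regardless of the gene selection method used, a classifier is ultimately needed to verify whether the selected genes can correctly classify the target categories. Choosing a classifier that correctly determines disease status or disease category therefore plays an important role in any strategy for selecting important genes. The main purpose of this study is to examine, under GADP, whether the choice among six commonly used classifiers affects the classification results obtained when validating the selected genes, and to verify and compare the sample classification accuracy achieved with the genes selected by GADP. The classifiers are the support vector machine (SVM), k-nearest neighbor (KNN), artificial neural network (ANN), linear discriminant analysis (LDA), decision tree (DT), and naive Bayes classifier (NB). On that basis, the study recommends the most suitable classifier.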

Parallel Abstract


Because microarray technology can measure the expression levels of thousands of genes simultaneously, it has been widely applied in bioinformatics and clinical diagnosis. Most of the genes measured, however, are irrelevant or insignificant for clinical diagnosis or research. Several novel methods have therefore been proposed to select relevant genes. Among these gene selection methods, the genetic algorithm with dynamic parameters (GADP) selects fewer genes while achieving higher prediction accuracy. The selected genes must still be passed to a classifier to verify that they classify the target categories correctly, so it is important to choose a classifier that can accurately predict the target category from the selected genes. In this study, six commonly used classifiers are compared on the genes selected by GADP in order to identify the most suitable one: support vector machine (SVM), k-nearest neighbor (KNN), artificial neural network (ANN), linear discriminant analysis (LDA), decision tree (DT), and naive Bayes classifier (NB).
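The comparison described above can be illustrated with a minimal sketch, which is not the study's actual procedure. It assumes leave-one-out cross-validation as the accuracy estimate (the abstract does not state the validation protocol), uses a small synthetic expression matrix in place of genes selected by GADP, and implements only two of the six classifiers (KNN and a Gaussian naive Bayes) in plain Python. All function names and parameters here are hypothetical.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def nb_predict(train_X, train_y, x):
    """Gaussian naive Bayes using per-class, per-gene mean and variance."""
    best_label, best_score = None, -math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        score = math.log(len(rows) / len(train_X))  # log prior
        for j, value in enumerate(x):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def loocv_accuracy(predict, X, y):
    """Leave-one-out accuracy: hold out each sample once, train on the rest."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


def make_toy_data(n_per_class=10, n_genes=4, shift=4.0, seed=0):
    """Synthetic stand-in for GADP-selected genes (not real microarray data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_genes)])
            y.append(label)
    return X, y


X, y = make_toy_data()
results = {
    "KNN": loocv_accuracy(knn_predict, X, y),
    "NB": loocv_accuracy(nb_predict, X, y),
}
# Rank the classifiers by estimated accuracy, best first.
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2f}")
```

Every classifier is scored on the same samples and the same gene subset, so any difference in accuracy is attributable to the classifier rather than to the selected genes.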

