
非線性降維方法在微陣列晶片資料的分類及預測

Nonlinear Dimensionality Reduction Approaches for Classification and Prediction of Microarray Data

Advisor: 鄭錦聰

Abstract


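The microarray is a major invention in biomedicine. It can be applied to disease detection, toxicology analysis, gene sequencing, and immune response analysis, among other uses. It also supports automated detection and can assay tens of thousands of genes in parallel, which makes it a powerful tool for biologists. Machine learning, a branch of artificial intelligence, has the ability to learn, and the support vector machine (SVM), a machine learning method, can perform both classification and regression analysis. This thesis proposes combining several nonlinear dimensionality reduction methods with the SVM to process microarray data for disease detection, and performs classification and prediction on lung cancer and ovarian cancer microarray data.

Many real-world data patterns are nonlinear and complex, and microarray data exhibit this nature especially strongly. Moreover, the number of genes used in disease detection determines the cost of testing. Identifying a smaller set of important feature genes reduces the number of genes a test requires, lowering cost while improving prediction accuracy. To find fewer feature genes and improve prediction accuracy, this thesis uses nonlinear dimensionality reduction methods to identify the important feature genes in large-scale gene microarray data. These methods include discrete stationary wavelet analysis, support vector regression, and several nonlinear dimensionality reduction methods that preserve the neighborhood relations of the original high-dimensional space and embed this manifold structure in a low-dimensional space. They are combined with SVM learning theory to classify and predict multiclass disease microarray data. Finally, computer simulations verify the effectiveness of the proposed method.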

Parallel Abstract


Microarray technology is a significant invention in biomedicine. It can be applied to disease detection, toxicology analysis, DNA sequencing, and immunoassays, among other uses. It also supports automated detection and can assay tens of thousands of genes in parallel, so it could become a major tool for biologists. Machine learning is a branch of artificial intelligence that has the ability to learn, and one of its popular approaches, the support vector machine, handles both classification and regression analysis. Hence, this thesis proposes using several nonlinear dimensionality reduction methods together with the support vector machine to process microarray data for disease detection, with classification and prediction performed on microarray data of lung and ovarian cancers.

In general, real-world data are nonlinear and complex, and microarray data share these properties. Furthermore, the number of genes used on a microarray for disease detection determines the cost of an application. Finding fewer important feature genes therefore reduces the number of genes used in disease detection, lowers the cost, and improves predictive accuracy. Hence, this thesis proposes using nonlinear dimensionality reduction methods to find the important feature genes among the large number of genes on a microarray. These methods include discrete stationary wavelet analysis, support vector regression, and several nonlinear dimensionality reduction methods that retain the neighborhood relations of the original high-dimensional space and embed this manifold structure in a low-dimensional space. Based on the results of these methods, the proposed approach is combined with the learning theory of the support vector machine to achieve multiclass disease classification and prediction from microarray data. Finally, the effectiveness of the proposed approach is verified by computer simulation.
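
Discrete stationary wavelet analysis, one of the methods named in the abstract, skips the downsampling step of the ordinary wavelet transform, so every level returns as many coefficients as the input has samples. The following is a minimal sketch of that idea, assuming a Haar filter with averaging normalization and periodic boundary handling. The abstract does not specify the wavelet, the number of levels, or how coefficients are turned into feature genes, so those choices, the function names `haar_swt` and `haar_iswt`, and the sample expression values are all hypothetical.

```python
def haar_swt(signal, levels):
    """Undecimated (stationary) Haar transform with periodic boundaries.

    Assumption: averaging normalization (divide by 2), chosen so that
    reconstruction is a plain sum. This is an illustrative convention,
    not one taken from the thesis.
    """
    n = len(signal)
    approx = [float(v) for v in signal]
    details = []
    for level in range(levels):
        # Instead of downsampling, the two filter taps are spread apart.
        step = 2 ** level
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(a - b) / 2.0 for a, b in zip(approx, shifted)])
        approx = [(a + b) / 2.0 for a, b in zip(approx, shifted)]
    return approx, details


def haar_iswt(approx, details):
    """Invert haar_swt: each level satisfies previous = approx + detail."""
    signal = list(approx)
    for detail in details:
        signal = [s + d for s, d in zip(signal, detail)]
    return signal


# Hypothetical expression profile of one sample across eight genes.
expression = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
approx, details = haar_swt(expression, levels=2)
restored = haar_iswt(approx, details)
```

Because no samples are discarded, `approx` and every list in `details` have the same length as `expression`, and `restored` matches the input up to floating-point rounding.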

