以蛋白質分佈之顯微形態結構進行嶄新內質網基因分類

Classification of Novel Endoplasmic Reticulum Genes by Numerical Analysis of Protein Localizations in Micrographs

Advisor: 蔡育秀

Abstract


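Proteins are fundamental molecules of living organisms. Different proteins shape what a cell does and how it behaves, and, more importantly, a protein can function only if it appears in the right place at the right time. The location at which a protein acts is determined by the signal sequence at its leading end, but signal sequences are short and hard to identify by comparing gene sequences alone. Previous studies indicate that proteins encoded by similar gene sequences also show similar distributions. This study therefore proposes that classifying protein distribution images, so that proteins with similar sequence structure are first coarsely grouped by their distribution images and only then compared at the sequence level, can be a more efficient way to find target sequences. Focusing on images of endoplasmic reticulum (ER) proteins, this study uses digital image processing to extract structural features of the ER and builds a classification system based on ER gene classes, so that researchers can obtain a coarse grouping by distribution morphology before analyzing protein sequences.

The system uses the original image, the skeleton image, and the brighter-region image of ER proteins. From these it extracts a total of 23 ER image features covering image texture, the peripheral network structure, and the ribosome-studded sheet structure. Once all features are obtained, SDA is used to find the best feature combination, and SVM is used to build the classification model and the unknown group. Across all training combinations, the selected features are spread over the texture features, the network skeleton features, and the bright-region morphological features, which indicates that the features extracted by this system are meaningful for ER classification. At present the system reaches an accuracy of 93.4% on known training images, with 21.4% of images assigned to the unknown group, and 86.8% on test images, with 26.5% assigned to the unknown group. Compared with the result without an unknown group, accuracy is about 7% higher, and the unknown group stays within 30%, which can greatly reduce the analysis time required by downstream researchers.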
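The trade-off between accuracy and the size of the unknown group can be illustrated with a small sketch. The abstract does not say how the SVM decides that an image is unknown, so the confidence threshold below is an assumption, as are the names `assign_with_unknown` and `accuracy_and_unknown_rate`, the toy scores, and the choice to measure accuracy only over images that were not rejected.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical convention: None marks an image placed in the unknown group.
UNKNOWN = None


def assign_with_unknown(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the top-scoring class, or UNKNOWN if its score is below threshold."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else UNKNOWN


def accuracy_and_unknown_rate(
    truths: List[str], predictions: List[Optional[str]]
) -> Tuple[float, float]:
    """Accuracy over the decided images, and the fraction sent to the unknown group."""
    decided = [(t, p) for t, p in zip(truths, predictions) if p is not UNKNOWN]
    unknown_rate = (len(predictions) - len(decided)) / len(predictions)
    accuracy = sum(t == p for t, p in decided) / len(decided) if decided else 0.0
    return accuracy, unknown_rate


# Toy per-class confidence scores for five images (made up for illustration).
example_truths = ["A", "B", "B", "A", "A"]
example_scores = [
    {"A": 0.90, "B": 0.10},
    {"A": 0.55, "B": 0.45},
    {"A": 0.20, "B": 0.80},
    {"A": 0.70, "B": 0.30},
    {"A": 0.48, "B": 0.52},
]

for threshold in (0.0, 0.6):
    preds = [assign_with_unknown(s, threshold) for s in example_scores]
    acc, unk = accuracy_and_unknown_rate(example_truths, preds)
    print(f"threshold={threshold}: accuracy={acc:.2f}, unknown={unk:.2f}")
```

In this toy data the two low-confidence images are exactly the ones a forced decision gets wrong, so rejecting them raises accuracy at the cost of leaving 40% of the images undecided. The figures reported in the abstract reflect the same kind of trade-off, though the thesis's actual rejection rule may differ.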

Keywords

SVM; image classification; protein distribution; endoplasmic reticulum

Parallel Abstract


Proteins are essential to maintaining the normal functions of living organisms. A cell's functions vary with the proteins it contains, and proteins work only when they are in the right place at the right time. A signal peptide at the front of a protein determines its location. Because the signal peptide sequence is very short and hard to find by sequence analysis alone, analyzing protein distribution in images is a good way to perform a coarse classification of protein functions. The structure of the endoplasmic reticulum (ER) is closely related to the cell's function, and the signal peptide might be found through analysis of the ER distribution. This research presents a system that combines digital image processing, feature extraction, and classification model building. Texture features are first extracted from the original image; image processing then produces a skeletonized image and a brighter-area image, from which features of the network structure and the ribosome-studded sheet structure are extracted. There are 23 features in total; only one of them is unused across all training sets, and all the others are meaningful for ER classification. The results show that the features extracted by this system are useful for ER classification. The best classification accuracy is 93.4% for the training set, with 21.4% of images in the unknown group, and 86.8% for the testing set, with 26.5% of images in the unknown group. The accuracy with the unknown group is about 7% higher than the accuracy without it.
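As a rough illustration of what a network-structure feature from a skeletonized image might look like, the sketch below counts end points and branch points in a binary skeleton. The abstract does not list the 23 features, so these two counts, the function name `skeleton_node_counts`, and the toy image are assumptions and not the thesis's actual feature set.

```python
from typing import List, Tuple


def skeleton_node_counts(skel: List[List[int]]) -> Tuple[int, int]:
    """Count end points (exactly 1 neighbour) and branch points (3 or more
    neighbours) among the foreground pixels of a binary skeleton, using
    8-connected neighbourhoods."""
    rows, cols = len(skel), len(skel[0])
    ends = branches = 0
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            # Sum the foreground pixels in the 3x3 window, excluding the centre.
            neighbours = sum(
                skel[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            )
            if neighbours == 1:
                ends += 1
            elif neighbours >= 3:
                branches += 1
    return ends, branches


# Toy skeleton: two diagonal strands crossing at the centre pixel.
x_shape = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]

ends, branches = skeleton_node_counts(x_shape)
print(f"end points: {ends}, branch points: {branches}")
```

On real skeletons a simple neighbour count can flag several adjacent pixels around one junction, so a practical feature extractor would usually merge neighbouring branch pixels before counting.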

Cited By


陳佳駿 (2016). 以蛋白質形態分佈建構內質網基因分群之研究 [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu201600206