
A Study of RNA Features for MicroRNA Target Prediction

Advisor: 黃乾綱
Co-advisor: 張天豪 (Tien-Hao Chang)

Abstract


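MicroRNAs (miRNAs) are non-coding RNAs with an average length of about 22 nucleotides. After gene transcription, miRNAs can repress the translation of target genes into protein and thereby influence many important biological processes. Identifying miRNA target genes through biological experiments, however, consumes a great deal of time and money, so many researchers are developing target-prediction algorithms to support experimental work. Based on current biological knowledge, existing prediction tools use features of miRNA action as prediction criteria, and these features fall into six categories: seed complementarity, thermodynamic stability for duplex, site accessibility, evolutionary conservation, site location, and multiplicity of binding sites. This study gathers the six categories of RNA features proposed in earlier work, adds non-Watson-Crick pairing and compactness features to give eight categories, and combines them with two classification algorithms, Support Vector Machine (SVM) and Random Forest, to predict the targets of human miRNAs; feature selection is then used to assess the importance of each feature category for target prediction. The experiments indicate that, compared with other existing prediction methods, this method achieves the best overall performance in target prediction: precision 91.4%, accuracy 78.5%, sensitivity 79.1%, and specificity 76.3%. The feature selection results from the RELIEF-F method give the importance of each feature, with the minimum free energy (MFE) of the miRNA-mRNA duplex being the most important. In addition, the nucleotide composition of the target gene, newly added in this study, and site accessibility also play important roles in predicting miRNA target genes.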
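The four evaluation indexes quoted in the abstract follow the standard confusion-matrix definitions. The following is a minimal sketch of those definitions; the function name and the counts in the example are hypothetical and are not taken from the thesis.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the four indexes as fractions in [0, 1].

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives.
    """
    return {
        "precision": tp / (tp + fp),            # predicted targets that are real
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),          # real targets that are recovered
        "specificity": tn / (tn + fp),          # non-targets correctly rejected
    }


# Hypothetical counts for illustration only; not the thesis's data.
metrics = classification_metrics(tp=80, fp=10, tn=30, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, precision is higher than specificity, which can happen when positive examples outnumber negative ones in the test set.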

Parallel Abstract


MicroRNAs (miRNAs), a class of small non-coding RNA molecules, play an important role in post-transcriptional gene regulation. MiRNAs suppress the translation of target genes into proteins and thereby affect many downstream biological interactions. Computational methods for miRNA target prediction have been developed to reduce the need for costly and time-consuming biochemical experiments. Based on current biological knowledge, existing approaches to miRNA target prediction employ six primary attributes of miRNA-mRNA interaction: seed complementarity, thermodynamic stability for duplex, site accessibility, evolutionary conservation, site location, and multiplicity of binding sites. In this study, we propose a comprehensive method based on eight feature categories: these six plus two proposed categories, non-Watson-Crick pairing and compactness. We extract these features and use two machine-learning algorithms, Support Vector Machine (SVM) and Random Forest, as classifiers to predict human miRNA targets. Using training and independent testing datasets, we compare the performance of our method with that of other current miRNA target prediction methods, and we assess the importance of RNA features for miRNA target prediction with the RELIEF-F feature selection method. Our method outperforms the other predictors, with a precision of 91.4%, an accuracy of 78.5%, a sensitivity of 79.1%, and a specificity of 76.3%. Moreover, according to the RELIEF-F scores, the minimum free energy (MFE) of the miRNA-mRNA duplex is the most significant RNA feature in miRNA target prediction, and composition and site accessibility are also shown to be important.
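To make the first feature category concrete, here is a minimal sketch of a seed complementarity check. It assumes the common convention that the seed is miRNA positions 2 to 8 and that a site is a perfect Watson-Crick match to it. The abstract does not say how the thesis defines or scores this feature, so the function name, the seed window, and the toy target sequence are all illustrative assumptions.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_positions(mirna, target, start=2, end=8):
    """Return 0-based positions in `target` that perfectly pair with the seed.

    `start` and `end` are 1-based, inclusive miRNA positions (assumed 2..8).
    Both sequences are read 5'->3'; DNA-style T is converted to U.
    """
    mirna = mirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    seed = mirna[start - 1:end]
    # The pairing site on the target is the reverse complement of the seed.
    site = "".join(COMPLEMENT[base] for base in reversed(seed))
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]


# Example miRNA sequence and a made-up target fragment, for illustration only.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
target = "AAACUACCUCAAA"
print(seed_match_positions(mirna, target))
```

A fuller feature extractor would also allow G:U wobble pairs and shorter seed windows, which is where the non-Watson-Crick pairing category comes in.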

