  • Degree thesis

Predicting Severity Scores for Related Genetic Diseases Using Information from Different Gene Variant Sites

Predicting the Severity Score for Genetic Disease Using Variants Information from Multiple Mutation types

Advisor: 賴飛羆

Abstract


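The disease severity score is one of the important criteria for judging the pathogenicity of a variant. Most genetic diseases present at different levels of severity, that is, the symptoms produced by the underlying gene variants range from mild to severe. A severity score predicted from variant information can improve the accuracy of a physician's diagnosis and, further, help select a suitable treatment for each level of severity. Although the genetics department of National Taiwan University Hospital already has a rule-based method for scoring disease severity, its accuracy on non-pathogenic variants is insufficient, and it cannot make predictions for insertion, deletion, or substitution variants. In this study, I collected three training data sets, one for each of three variant types, each containing pathogenic variants that cause severe symptoms and benign variants that cause mild or no symptoms. I trained a model with each of six machine learning algorithms to find the most suitable algorithm. The random forest and XGB models were the most accurate: on validation data their accuracy stably reaches about 87%, roughly 5 percentage points higher overall than the method currently in use. The largest gain is in predicting benign variants with mild or no symptoms, where the share of benign variants predicted wrongly drops from 37% to 12%. The prediction models have been packaged as a usable tool: given a patient's genetic data, the tool directly predicts the severity score, which can shorten variant interpretation time and improve the efficiency and accuracy of physicians and genetics researchers.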

Parallel Abstract


Most genetic diseases show a spectrum of severity, that is, the symptoms caused by a genetic defect range from mild to severe. Predicting a severity score from variant interpretation information can increase diagnostic accuracy and help clinicians choose an appropriate treatment for each level of severity. Although National Taiwan University Hospital already has a fixed-sum scoring method for calculating disease severity, its accuracy on benign variants is insufficient, and it cannot score insertion, deletion, or frameshift variants. In this research, we collected three training data sets for three different mutation types, each including pathogenic variants that cause severe symptoms and benign variants that cause mild or no symptoms. We trained six machine learning algorithms on each data set to find the most suitable algorithm for each mutation type. The random forest model has the highest accuracy for Startloss and FsIndel, and the XGB model has the highest accuracy for SNV. On validation data, accuracy stably reaches about 87%, higher than the 82% accuracy of the current method. The largest improvement is on benign variants, where the share of wrong predictions falls from 37% to 12%. We have built a Python script that predicts a severity score for each variant.
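The abstract reports two different figures: overall accuracy, and the share of benign variants that are predicted wrongly. As a minimal sketch of how the two could be computed, assuming binary labels `benign` and `pathogenic`: the function names and the toy labels below are illustrative assumptions, not taken from the thesis tool or its data.

```python
def accuracy(y_true, y_pred):
    """Fraction of variants whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def class_error_rate(y_true, y_pred, label):
    """Fraction of variants whose true class is `label` but were predicted otherwise."""
    preds_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
    if not preds_for_class:
        raise ValueError(f"no variants with true label {label!r}")
    wrong = sum(p != label for p in preds_for_class)
    return wrong / len(preds_for_class)


# Toy data (hypothetical): 4 benign and 6 pathogenic variants.
y_true = ["benign"] * 4 + ["pathogenic"] * 6
y_pred = (
    ["benign", "benign", "benign", "pathogenic"]  # one benign variant missed
    + ["pathogenic"] * 5 + ["benign"]             # one pathogenic variant missed
)

overall = accuracy(y_true, y_pred)                        # 8 of 10 correct
benign_err = class_error_rate(y_true, y_pred, "benign")   # 1 of 4 wrong
print(f"overall accuracy: {overall:.2f}")
print(f"benign error rate: {benign_err:.2f}")
```

Because the per-class error rate is computed only over variants of that class, it can move much more than overall accuracy when one class is a minority of the data set, which is consistent with a large change on benign variants alongside a smaller change overall.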

Parallel Keywords

Machine Learning; Variant; Severity Score

