
利用跨平台基因晶片資料建立預測輻射線劑量之基因圖譜

Development of Gene Expression Signatures for Radiation Exposure Prediction using Cross-Platform Microarray Data

Advisor: 莊曜宇
Co-advisor: 蔡孟勳 (Mong-Hsun Tsai)

Abstract


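Exposure of cells to ionizing radiation causes DNA double-strand breaks and triggers DNA damage. When the human body is exposed to ionizing radiation, the risk of cancer increases and radiation sickness can result; at high doses, acute effects may even be fatal. In a nuclear accident, such as the one at Fukushima, Japan, in 2011, an accurate estimate of a person's exposure dose would provide valuable information early in treatment and allow treatment to be matched to the severity of exposure, yet a rapid and appropriate method of dose assessment is still lacking.
In recent years, published studies have reported associations between gene expression signatures and radiation exposure dose and have used these signatures to predict and discriminate exposure doses. Most of these studies, however, tested only their own internal data and did not use additional external data for corroboration. This raises several problems, such as systematic biases between different microarray datasets and the large variability between microarray studies. In addition, gene expression after radiation stimulus is cell line-specific, yet earlier studies often examined only a single cell line, which greatly increases the probability of false positives, that is, selected genes that are not directly related to radiation dose. To address these problems, we used a meta-analysis approach, collecting publicly available microarray data for comprehensive analysis. First, using statistical models, we selected differentially expressed genes from each dataset separately. These genes, selected from different datasets, were then further filtered by the number of times and the frequency with which they appeared across datasets. Next, machine learning algorithms, such as the support vector machine, were used to build a prediction model that discriminates high from low radiation doses. Finally, cross-validation on internal data and validation on external data were used to assess whether the prediction model built on this gene expression signature can predict and discriminate radiation dose in different samples.
This study identified a gene expression signature of 29 genes that can be reliably applied to distinguish high radiation doses (> 8 Gy) from low radiation doses (< 2 Gy). The genes selected in the two dose groups overlapped very little, but the p53 regulatory pathway remained the main biological regulatory pathway in both groups. Compared with previous related studies, the biomarkers we selected had the best predictive performance. In cross-validation on the internal data, the overall accuracy reached 85%, a marked improvement of about 6–14% over the results of other studies. In validation on 96 external samples, the predictions of the other studies were unstable and unreliable in accuracy, whereas the gene expression signature identified in this study still achieved an accuracy of 84%, demonstrating its stability and accuracy across different sample data.
In this study, we integrated cross-platform microarray data to develop a comprehensive approach that provides a more stable and effective gene-based prediction model for predicting different radiation doses, with the aim of applying it to dose assessment and analysis of clinical samples in the future.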

Parallel Abstract


Exposure to ionizing radiation (IR) can cause DNA damage in cells through DNA double-strand breaks. It is well known that exposure to IR increases the risk of cancer and acute radiation sickness, and at high doses may cause rapid death. In a radiological emergency, such as the Fukushima Daiichi nuclear disaster in 2011, accurate evaluation of the exposure dose is essential for guiding early treatment. In recent years, several studies have proposed gene expression signatures associated with radiation dose, derived with high-throughput microarray techniques, that can serve as predictors of the dose received. However, most of them tested performance only on their original data, and no external datasets were used for validation. Results based on a single microarray dataset are prone to systematic biases and are therefore limited in biological interpretation. Moreover, studies that focused on a single cell line typically had high false positive rates because transcriptional responses to IR are cell line-dependent. To overcome these issues, we propose a comprehensive approach based on a meta-analysis of publicly available microarray datasets. First, genes differentially expressed in response to different radiation doses were identified in each dataset using statistical models. Next, the identified genes were further filtered by the frequency with which they appeared across datasets. A machine learning algorithm, the support vector machine, was then used to develop prediction models that distinguish high-dose from low-dose exposure. Lastly, the performance of the meta-signature was evaluated by cross-validation on the internal datasets and independently validated on external studies.
In this study, we identified a 29-gene meta-signature that can distinguish cells exposed to a high dose (> 8 Gy) from cells exposed to a low dose (< 2 Gy). Gene functions related to the apoptosis regulating pathway and the p53 signaling pathway were significantly enriched in the meta-signature. Compared with previous studies, our meta-signature predicted IR exposure levels more accurately, with an overall accuracy of 85% (6–14% higher than previous studies) in internal cross-validation. In external validation, the meta-signature still predicted the samples correctly, with an overall accuracy of 84% (81/96). In conclusion, we provide a comprehensive approach for dissecting radiation responses to different doses across independent studies. The findings may facilitate exploration of the biological functions regulated by IR. Above all, the results show improved robustness of prediction and may be applicable in practice.
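
The frequency-based filtering step described in the abstract can be illustrated with a minimal sketch. It assumes that each dataset has already produced its own list of differentially expressed genes, and it keeps only genes that recur in at least a chosen number of datasets. The function name `recurrent_genes`, the `min_datasets` threshold, and the dataset and gene labels are all hypothetical placeholders; the abstract does not state the actual threshold or the gene lists used in the thesis.

```python
from collections import Counter


def recurrent_genes(deg_lists, min_datasets):
    """Return genes reported as differentially expressed in at least
    `min_datasets` of the given datasets (hypothetical threshold)."""
    counts = Counter()
    for genes in deg_lists.values():
        # Count each gene at most once per dataset.
        counts.update(set(genes))
    return sorted(g for g, n in counts.items() if n >= min_datasets)


# Toy input: placeholder labels only, not data from the thesis.
deg_lists = {
    "dataset_1": ["gene_a", "gene_b", "gene_c", "gene_d"],
    "dataset_2": ["gene_a", "gene_b", "gene_e"],
    "dataset_3": ["gene_a", "gene_d", "gene_e", "gene_b"],
}

signature = recurrent_genes(deg_lists, min_datasets=2)
print(signature)
```

Genes that pass such a filter would then serve as the input features of the classifier; the support vector machine training and validation steps are not shown here.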
