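Introduction: Depression is a common mental disorder whose prevalence is rising worldwide; according to WHO statistics, about 3.8% of the world's population had depression in 2019. Common treatments include psychotherapy, pharmacotherapy, and electroconvulsive therapy, of which pharmacotherapy is the most widely used. Antidepressants are the main psychotropic drugs for depression and come in many classes, with selective serotonin reuptake inhibitors (SSRIs) being the most commonly used first-line agents. However, patients' responses to SSRIs vary widely. In current clinical practice, a patient taking an antidepressant for the first time must wait several weeks to learn whether the prescribed drug is effective, and in the end only about two-thirds of patients respond to drug treatment. Identifying as early as possible whether a given antidepressant works for a patient has therefore become an important issue worldwide.

Genetic factors play an important role in research on antidepressant treatment outcomes. Many potentially relevant genetic factors have been reported, but their associations with antidepressant response were found only in individual studies, and much remains unclear about the single nucleotide polymorphism (SNP) variants or loci identified by GWAS and about the mechanisms underlying antidepressant response. Studying the association between genetic variants and traits, and the mechanisms behind it, through GWAS alone is difficult. By comparison, studying this association through gene expression is more intuitive and reduces uncertainty. PrediXcan is a program that uses statistical methods to impute gene expression from SNP data and relate it to phenotype data. With this program, we can look for direct associations between antidepressant response and genes through imputed gene expression.

Our study aims to build several prediction models of antidepressant treatment response from SNP data or imputed gene expression data and to compare their predictive performance. Finally, we use the prediction models or GWAS to identify candidate genes associated with antidepressant treatment response.

Methods: The study included 567 subjects from Taiwan, Japan, and Thailand, all diagnosed with depression by physicians according to DSM-IV. The patients took escitalopram, citalopram, paroxetine, fluoxetine, or fluvoxamine as prescribed. At enrollment and at each follow-up visit, depression severity was assessed with the Hamilton Rating Scale for Depression (HRSD).

Predicting SSRI treatment response involved four steps: stratified sampling, GWAS, gene expression imputation, and prediction modeling. First, we split the patients into training and validation sets at a 4:1 ratio, stratifying all patients by the proportion of each SSRI used. This stratified sampling was performed five times to emulate five-fold cross-validation. Next, we ran GWAS and selected index SNPs with p-value < 0.00001 as predictors. Gene expression was imputed with PrediXcan for two tissues, whole blood and hypothalamus. The imputation used the elastic net models released in 2019, with weights taken from the PredictDB database. Finally, we trained random forest models and used them for prediction, with demographic variables and the genetic data described above as predictors.

Results: In the GWAS of SSRI treatment response, five SNPs reached the suggestive significance level (p-value < 5×10⁻⁶): rs35519514 in the XDH gene, rs8005122, rs7252187 in the CD70 gene, rs11665781 in the ZNF gene, and rs12240519 in the NRG3 gene. Prediction models built from SNP data all reached an accuracy above 80% and an AUC above 0.8. Models built from imputed gene expression data reached an accuracy above 70% and an AUC above 0.7. Models combining both data types mostly reached an accuracy above 80% and an AUC of up to 0.8, so their performance fell between that of the two single-data models.

Conclusion: In terms of predictive performance, the models combining both data types did not outperform the models built from SNP data alone. Possible reasons include a mismatch between the reference population behind the expression imputation weights and the population of our subjects, and the small sample size, both of which could reduce predictive power. In terms of candidate genes, NRG3 was associated with treatment response both in our prediction models and in the GWAS results, which suggests that NRG3 may influence SSRI treatment response. Further studies are needed to confirm the association between NRG3 and SSRI treatment response.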
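The stratified sampling step in the Methods can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the five repeats form disjoint folds (so each split is 4:1), and the function names `stratified_folds` and `train_test_split`, the patient IDs, and the drug counts are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(drug_by_patient, n_folds=5, seed=0):
    """Assign each patient to one fold while balancing SSRI types across folds.

    drug_by_patient: dict mapping a patient ID to the SSRI prescribed.
    Returns a dict mapping patient ID to a fold index in [0, n_folds).
    """
    rng = random.Random(seed)
    by_drug = defaultdict(list)
    for patient, drug in sorted(drug_by_patient.items()):
        by_drug[drug].append(patient)

    fold_of = {}
    offset = 0
    for drug in sorted(by_drug):
        patients = by_drug[drug]
        rng.shuffle(patients)
        # Deal patients round-robin so every fold gets a near-equal share of this drug.
        for i, patient in enumerate(patients):
            fold_of[patient] = (i + offset) % n_folds
        offset += len(patients)
    return fold_of


def train_test_split(fold_of, test_fold):
    """Hold out one fold for testing and train on the remaining folds."""
    train = sorted(p for p, f in fold_of.items() if f != test_fold)
    test = sorted(p for p, f in fold_of.items() if f == test_fold)
    return train, test


# Toy cohort (hypothetical IDs and counts, not the study data).
drugs = ["escitalopram"] * 10 + ["paroxetine"] * 10 + ["fluoxetine"] * 5
cohort = {f"P{i:03d}": d for i, d in enumerate(drugs)}
folds = stratified_folds(cohort)
train_ids, test_ids = train_test_split(folds, test_fold=0)
print(len(train_ids), len(test_ids))  # 20 5
```

Holding out each fold in turn gives five 4:1 splits in which every drug appears in the same proportion as in the full cohort.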
Background: Major depressive disorder (MDD) is a common psychiatric disorder worldwide. Among treatments for depression, antidepressants are the most common choice for patients with MDD. There are several types of antidepressants, and selective serotonin reuptake inhibitors (SSRIs) are the most commonly used pharmacotherapy for MDD. However, response to SSRI treatment varies from patient to patient. It takes several weeks to verify whether SSRI therapy is effective, and only about half to two-thirds of patients respond to it. It is therefore important to identify as early as possible whether a patient with MDD will respond to SSRIs. Genetic predictors potentially play an important role in antidepressant treatment outcome. Nevertheless, most associations with specific genetic variants have not been replicated, and the susceptibility loci found by GWAS and their mechanisms are often unclear. The relationship between genetic variants and complex traits is difficult to observe directly in GWAS, but it can be explored through gene expression. PrediXcan is a computational method that imputes gene expression from single nucleotide polymorphism (SNP) genotyping or sequencing data and relates the imputed expression to phenotypes. It helps integrate GWAS and eQTL studies for mapping complex traits. Our study aims to find candidate genetic variants or genes using GWAS and prediction models. We used SNP data and imputed gene expression data to predict SSRI treatment response with a machine-learning method, and we compared the prediction performance of the SNP data model and the imputed gene expression model.

Methods: All 567 patients came from the International SSRI Pharmacogenomics Consortium (ISPC), were from three Asian countries (Taiwan, Thailand, and Japan), and were diagnosed with MDD according to DSM-IV.
The patients were treated with escitalopram, citalopram, paroxetine, fluoxetine, or fluvoxamine according to the judgment of the study clinicians. Depression severity was rated with the Hamilton Rating Scale for Depression (HRSD) in all participants at baseline and at follow-up visits. We built the main prediction models in four steps: stratified sampling, GWAS, gene expression imputation, and prediction modeling. First, we assigned subjects to a training group or a testing group by stratified sampling on the proportion of each type of antidepressant, and we repeated this stratified sampling five times. Second, we conducted GWAS and selected the index SNPs with p-value < 0.00005 as predictors in the SNP data prediction model. Third, we applied PrediXcan to impute gene expression levels from the SNP data in two tissues, whole blood and hypothalamus. We used the elastic net models released in 2019, with regression weights taken from the PredictDB database. Lastly, we used the genetic data and demographic variables as predictors to train random forest models that predict treatment response. We then internally validated the trained prediction models with five-fold cross-validation.

Results: None of the five GWAS runs produced a genome-wide significant signal (p-value < 5×10⁻⁸). Nevertheless, a total of five markers reached suggestive significance (p-value < 5×10⁻⁶) for association with SSRI response: rs35519514 in the XDH gene, rs8005122, rs7252187 mapping to the CD70 gene, rs11665781 in the ZNF350 gene, and rs12240519 mapping to the NRG3 gene. For the prediction models based on SNP data, accuracy was above 80% in every case, and the AUC was also greater than 0.8. For the models based on imputed gene expression data, accuracy reached 70% or more, and the AUC was greater than 0.7 in every case. For the models combining SNP data and imputed gene expression data, accuracy was about 80%.
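The imputation step can be illustrated with a small sketch. In PrediXcan-style imputation, the predicted expression of a gene is a weighted sum of a person's allele dosages, with one weight per SNP taken from a pre-trained model. The function name `impute_expression`, the placeholder SNP IDs, and the weight values below are hypothetical and are not taken from PredictDB or from the study.

```python
def impute_expression(dosages, weights):
    """Predicted expression of one gene for one person.

    dosages: dict mapping a SNP ID to the effect-allele count (0 to 2).
    weights: dict mapping a SNP ID to its weight in the gene's model.
    SNPs that are in the model but not genotyped are skipped.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


# Hypothetical weights for one gene; the SNP IDs are placeholders.
gene_weights = {"snpA": 0.5, "snpB": -0.25, "snpC": 0.125}
# One person's genotype; snpC was not genotyped and is ignored.
person = {"snpA": 2, "snpB": 1}

print(impute_expression(person, gene_weights))  # 0.75
```

Repeating this for every gene and tissue yields an expression matrix that can serve as the predictor set for the imputed gene expression model.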
Conclusion: The prediction model based on both SNP data and imputed gene expression data performed better than the model based on imputed gene expression data alone, but worse than the model based on SNP data alone. In addition, the NRG3 gene might be associated with SSRI treatment response: it was an important predictor in the prediction model, and it also reached suggestive significance in GWAS.
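The two metrics used to compare the models, accuracy and AUC, can be computed as in the sketch below. It is a plain-Python illustration under stated assumptions: responders are coded 1 and non-responders 0, the 0.5 cutoff is arbitrary, and the labels and scores are made up for the example rather than drawn from the study.

```python
def accuracy(labels, predictions):
    """Fraction of patients whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Probability that a random responder outscores a random non-responder.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one responder and one non-responder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]  # model-predicted probability of response
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, predicted))  # 0.6
print(round(auc(labels, scores), 3))  # 0.833
```

Accuracy depends on the chosen cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is why both are reported together.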