  • Thesis


Multiple Hypothesis Testing in Large-scale Association Studies

Advisor: 蕭朱杏


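Abstract (Chinese) When a large-scale association study infers the relationship between a disease and thousands of genetic markers, disease susceptibility genes are usually identified by testing the association between the disease and many markers, such as SNPs. This raises a multiple testing problem: the type I error may grow as the number of tests increases. The traditional Bonferroni method does not inflate the type I error, but it is so conservative that it reduces power. In an initial genome scan, however, power is the primary concern; in other words, reducing the type II error better serves the purpose. This thesis proposes a two-stage approach. When selecting associated SNPs in the first stage, we do not want to miss SNPs with only moderate association; that is, power is raised so that most disease-associated SNPs are selected. The second stage applies a more stringent significance level to control the overall false positive rate. We present the practical procedure for the two-stage method and examine its statistical properties, including derivations of the overall false positive rate and the true positive rate (TPR). We also recommend the sample size required by the two-stage design and how to allocate it between the first and second stages. A simulation study then evaluates the performance of the two-stage method and compares its overall false positive rate and TPR with those of the traditional Bonferroni method. The simulation results show that the smaller the difference in allele frequency between the case and control groups, the better the overall TPR of the two-stage method relative to the Bonferroni method; the two-stage method still adequately controls the overall false positive rate.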


Multiple testing, Two-stage method, Bonferroni method, TPR, FPR, FDR


Abstract Multiple hypothesis testing is a common problem in genome-wide association studies. As the number of markers increases, the overall false positive rate inflates. The traditional Bonferroni correction is so stringent that the overall power is usually small, which may not serve the primary interest of finding markers with even a mild effect. In this thesis, we propose a two-stage selection method to address this problem. The main idea is to maintain substantial power in the first stage and to control the incurred false positives in the second stage. We describe the implementation of the proposed procedure and derive its statistical properties, including the rate at which non-associated SNPs are eliminated, the overall false positive rate, and the overall true positive rate. In addition, we recommend how to determine the sample size for each stage. We also illustrate the proposed method with a simulation study and compare it with the Bonferroni method. The two-stage procedure performs better than the Bonferroni method even when the difference in marker allele frequency between the case and control groups is moderate.


Two-stage method, Bonferroni method, TPR, FPR, FDR
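
The trade-off described in the abstract can be illustrated numerically. The following Python sketch is not the procedure derived in the thesis; it rests on simplifying assumptions that are ours: tests are independent, each test statistic is normal with unit variance and mean `effect` under the alternative, and a marker is declared associated only if it is significant in both stages on independent samples. The function names and all numeric values are hypothetical.

```python
from statistics import NormalDist


def familywise_error(alpha, m):
    """P(at least one false positive) among m independent null tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def z_test_power(effect, alpha):
    """Power of a two-sided z-test whose statistic is N(effect, 1) under the alternative."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    return (1.0 - nd.cdf(z_crit - effect)) + nd.cdf(-z_crit - effect)


def two_stage_rates(effect1, effect2, alpha1, alpha2):
    """Per-marker (FPR, TPR) when a marker must pass both stages.

    Assumption (ours, not the thesis's): the two stages use independent
    samples, so the per-stage probabilities multiply.
    """
    fpr = alpha1 * alpha2
    tpr = z_test_power(effect1, alpha1) * z_test_power(effect2, alpha2)
    return fpr, tpr


if __name__ == "__main__":
    alpha = 0.05
    effect = 3.0  # hypothetical standardized effect size
    for m in (1, 10, 1000, 100000):
        uncorrected = familywise_error(alpha, m)
        bonferroni_power = z_test_power(effect, alpha / m)
        print(f"m={m:>6}  uncorrected FWER={uncorrected:.4f}  "
              f"Bonferroni power={bonferroni_power:.4f}")

    # Hypothetical two-stage setting: lenient first stage, strict second stage.
    fpr, tpr = two_stage_rates(effect1=3.0, effect2=3.0, alpha1=0.1, alpha2=0.001)
    print(f"two-stage per-marker FPR={fpr:.2e}  TPR={tpr:.4f}")
```

Running the script prints how the uncorrected family-wise error rate rises and the Bonferroni-corrected power falls as the number of markers `m` grows. Whether a two-stage design outperforms Bonferroni depends on the stage-wise significance levels and the sample-size allocation, which this sketch does not optimize.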




