  • Degree thesis

Statistical Methods for Examining Non-random Sampling in Randomized Controlled Trials

Exploring Non-random Sampling in Randomized Controlled Trials

Advisor: 杜裕康

Abstract


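Background: In a randomized controlled trial (RCT), baseline variables should have similar characteristics across treatment groups; a departure from this suggests that the data may be anomalous. The main aim of this study is to extend the data-checking method proposed by Dr. Carlisle in 2017, to explore its limitations, and to apply it to checking the correctness of RCT data included in a network meta-analysis. Methods: The study has two parts: simulation and analysis of real data. In the simulation, we first generated RCT data under scenarios that violate the method's assumptions: correlation among variables, a non-normal population distribution of the variables, and imprecisely reported summary statistics. We tested whether the p-value distribution in each scenario was the expected uniform distribution, to verify whether Dr. Carlisle's method remains valid under these scenarios. In the real-data analysis, we checked the baseline data for clinically important measures in all RCTs included in the network meta-analysis published by Tu (2012). Results: The simulations showed that all three scenarios affect the p-value distribution so that it is no longer uniform, which increases the chance of false positives. Correlation among variables causes p-values to cluster, and the effect strengthens as the number of variables increases. Non-normal data also affect the p-value distribution, but the effect weakens as the sample size increases. When summary statistics are reported imprecisely because of rounding, the p-values are no longer uniformly distributed, and this effect strengthens as the sample size increases. In the real-data analysis, the p-value distribution deviated significantly from uniform, with p-values skewed toward large values, both for clinical attachment level (CAL) considered alone and, when trials were grouped by study design, for the variables of trials in the "other study designs" group. We conjecture that this arises because the study design itself makes the variables more similar between groups; the finding may be only a false positive that reflects the method's limitations, and the influence of different study designs on the method needs further investigation. Conclusions: Dr. Carlisle's method is valid only when the variables are mutually independent, the data are normally distributed, and the summary statistics are reported precisely. For data flagged as possibly problematic by this method, one must further check whether the method's own assumptions were violated, to avoid misinterpreting a false-positive result.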
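
The expected uniform distribution of baseline p-values follows from a standard result (the probability integral transform). A brief sketch, assuming a continuous test statistic; the symbols T (test statistic), F_0 (its distribution function under the null hypothesis H_0) and u are notation introduced here for illustration and do not come from the thesis:

```latex
p = 1 - F_0(T), \qquad
\Pr(p \le u \mid H_0)
  = \Pr\bigl(F_0(T) \ge 1 - u \mid H_0\bigr)
  = u, \quad 0 \le u \le 1 .
```

The last step uses the fact that F_0(T) is uniform on (0, 1) when F_0 is continuous. Rounded summary statistics make the computed statistic discrete, so this argument no longer holds exactly.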

Parallel Abstract


Introduction: “Non-random sampling data” refers to RCT data without balanced baseline covariates between allocation groups, suggesting possible data anomalies. Recently, Carlisle (2017) proposed a screening method to detect possible non-random sampling in RCTs, based on the theory that comparisons between allocation groups for baseline variables should produce a uniform distribution of p-values. However, some assumptions underlying this method are commonly violated in RCTs. The aim of the present study was to investigate how violating these assumptions affects the validity of Carlisle’s method in detecting non-random sampling. Methods: Simulations and an empirical assessment were conducted to explore the effect of violating the method’s assumptions. In the simulations, hypothetical RCT data were generated under three assumption-violating scenarios: correlated variables, non-normally distributed data, and imprecisely reported data. P-values were obtained from comparisons between allocation groups using a t-test or ANOVA. The validity of Carlisle’s method was assessed by checking the uniformity of the p-value distribution. In the empirical assessment, we examined the clinically important variables of all RCTs included in the network meta-analysis of Tu (2012) and discussed the limitations of applying the data-checking method. Results: Our simulations found inflation of the type I error in all assumption-violating scenarios. The clustering effect of correlated variables was amplified as the number of variables increased. The skewing effect of non-normal data weakened as the sample size increased, consistent with the central limit theorem. Imprecise reporting made the data appear more similar between groups, increasing the chance of a trial being incorrectly flagged as unusual. This bias was amplified as the sample size increased. In the empirical assessment, we found non-uniformly distributed p-values for clinical attachment level (CAL) and within different study design groups. This result suggests that the randomization design may affect the distribution of baseline p-values. Conclusions: Carlisle’s method performs well only if the data are independent, normally distributed, and reported with sufficient precision. Otherwise, the type I error is inflated and the method is no longer valid. For RCTs flagged as unusual by Carlisle’s method, further investigation is needed to determine whether the data truly did not come from random samples or the finding is a false alarm.
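
The simulation logic described in the Methods and Results can be illustrated with a minimal sketch. This is not the thesis's code. It assumes two-arm trials, a single normally distributed baseline variable, and a large-sample normal (z) approximation in place of the t-test the abstract names, so that only the Python standard library is needed. The function names, the parameter values (mean 50, SD 1, 100 participants per arm, 2000 trials) and the use of a Kolmogorov-Smirnov distance as the uniformity check are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def summarize(values):
    """Return the sample mean and sample standard deviation."""
    n = len(values)
    m = math.fsum(values) / n
    var = math.fsum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var)


def z_test_p(m1, s1, m2, s2, n):
    """Two-sided p-value for a difference in means from summary statistics.

    Large-sample normal approximation (an assumption of this sketch),
    with n participants in each group.
    """
    se = math.sqrt((s1 ** 2 + s2 ** 2) / n)
    z = (m1 - m2) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def ks_uniform(p_values):
    """Kolmogorov-Smirnov distance between the p-values and Uniform(0, 1)."""
    ps = sorted(p_values)
    k = len(ps)
    return max(max((i + 1) / k - p, p - i / k) for i, p in enumerate(ps))


def simulate(n_trials=2000, n_per_group=100, decimals=None, seed=1):
    """Baseline p-values for trials whose groups come from the same population.

    If `decimals` is given, means and SDs are rounded before testing,
    mimicking imprecisely reported summary statistics.
    """
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_trials):
        a = [rng.gauss(50.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(50.0, 1.0) for _ in range(n_per_group)]
        stats = [*summarize(a), *summarize(b)]
        if decimals is not None:
            stats = [round(v, decimals) for v in stats]
        p_values.append(z_test_p(*stats, n_per_group))
    return p_values


exact = simulate()               # summaries used at full precision
rounded = simulate(decimals=1)   # same data, summaries rounded to 1 decimal
d_exact = ks_uniform(exact)
d_rounded = ks_uniform(rounded)
share_one = sum(p == 1.0 for p in rounded) / len(rounded)

print(f"KS distance, exact summaries:   {d_exact:.3f}")
print(f"KS distance, rounded summaries: {d_rounded:.3f}")
print(f"share of p == 1 after rounding: {share_one:.2f}")
```

Because both runs use the same seed, rounding is the only difference between them. With these settings the standard error of each mean equals the rounding step (0.1), so many trials report identical group means after rounding and receive a p-value of exactly 1, which pulls the distribution away from uniform toward large p-values.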

References


1. Casella G, Berger RL. Statistical Inference. Pacific Grove, CA: Duxbury, 2002.
2. Benford F. The Law of Anomalous Numbers. Proceedings of the American Philosophical Society 1938;78(4):551-72.
3. Buyse M, George SL, Evans S, et al. The role of biostatistics in the prevention, detection and treatment of fraud in clinical trials. Statistics in Medicine 1999;18(24):3435-51. doi: 10.1002/(sici)1097-0258(19991230)18:24<3435::Aid-sim365>3.0.Co;2-o
4. Nuijten MB, Hartgerink CHJ, van Assen MALM, Epskamp S, Wicherts JM. The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods 2016;48(4):1205-26. doi: 10.3758/s13428-015-0664-2
5. Carlisle JB. Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia 2017;72(8):944-52. doi: 10.1111/anae.13938
