Exact Inference for Paired ROC Curves

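The receiver operating characteristic (ROC) curve is a widely used statistical method for evaluating the accuracy of diagnostic tools. It has been applied successfully in radiology, psychiatry, epidemiology, biomedical informatics, and other fields. The indices most commonly used to summarize an ROC curve are the area under the ROC curve and the partial area under the ROC curve. The area under the ROC curve is the average sensitivity over the entire range of specificity; the partial area under the ROC curve is the average sensitivity when specificity is restricted to a clinically meaningful range. In diagnostic trials, comparing the accuracy of a new diagnostic tool with that of the current standard tool is an important problem. Using the concepts of the generalized p-value and the generalized confidence interval, this dissertation proposes exact inferences for the area and the partial area under paired ROC curves. We also extend the proposed methods to compare the accuracy of two diagnostic tools that each combine multiple markers. Extensive simulation studies show that the proposed exact tests keep the type I error rate close to the nominal level and that the proposed exact interval estimates provide sufficient coverage probability. In general, the proposed exact methods outperform maximum likelihood and nonparametric methods. Finally, we apply the proposed methods to several real diagnostic datasets on pancreatic cancer, atherosclerosis, and ovarian cancer.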
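The two summary indices defined above can be illustrated with a minimal sketch. It uses the empirical (nonparametric) ROC curve, not the exact method proposed in the dissertation. The function names `empirical_auroc` and `empirical_pauroc`, the half-weight treatment of ties, and the toy marker values are assumptions made only for this illustration. The false positive rate is 1 minus specificity, so restricting specificity to a high range corresponds to `fpr_max` below.

```python
def empirical_auroc(diseased, healthy):
    """Mann-Whitney estimate of the area under the ROC curve.

    Counts the fraction of (diseased, healthy) pairs in which the
    diseased subject has the larger marker value; ties count one half.
    """
    score = sum((d > h) + 0.5 * (d == h) for d in diseased for h in healthy)
    return score / (len(diseased) * len(healthy))


def empirical_pauroc(diseased, healthy, fpr_max):
    """Area under the empirical ROC curve for FPR in [0, fpr_max].

    The empirical ROC curve is treated as a step function: on the j-th
    FPR step the threshold is the j-th largest healthy value, and the
    height is the sensitivity at that threshold (ties count one half).
    """
    n1, n0 = len(diseased), len(healthy)
    area = 0.0
    for j, t in enumerate(sorted(healthy, reverse=True)):
        sens = sum((d > t) + 0.5 * (d == t) for d in diseased) / n1
        left, right = j / n0, (j + 1) / n0
        # Only the part of this step that lies inside [0, fpr_max] counts.
        width = max(0.0, min(right, fpr_max) - left)
        area += sens * width
    return area


# Hypothetical marker values, for illustration only.
diseased = [3, 4, 5]
healthy = [1, 2, 3.5]
auc = empirical_auroc(diseased, healthy)
pauc = empirical_pauroc(diseased, healthy, fpr_max=1 / 3)
```

Dividing `pauc` by `fpr_max` gives the average sensitivity over the restricted range, and setting `fpr_max=1.0` recovers the full area returned by `empirical_auroc`.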

Keywords

generalized test variable, generalized pivotal quantity, generalized p-value, generalized confidence interval, multiple markers

Parallel Abstract

The receiver operating characteristic (ROC) curve is a popular statistical tool for assessing the accuracy of diagnostic devices. It has been widely used in practical applications such as radiology, psychiatry, epidemiology, and biomedical informatics. One of the primary objectives of diagnostic trials is to compare the diagnostic accuracy of a new device with that of the current standard device. The area under the ROC curve (AUROC) is a summary index interpreted as the average true positive rate over the entire range of false positive rates. The partial area under the ROC curve (PAUROC) is another summary index, one that restricts attention to a specified range of clinical interest. Both are commonly used as the basis of statistical inference for comparing ROC curves. In this dissertation, we develop exact inferences for comparing paired AUROCs and paired PAUROCs based on the concepts of generalized p-values and generalized confidence intervals. In addition, we extend the results to compare paired ROC curves constructed from multiple markers. Simulation results demonstrate that the exact test based on generalized p-values adequately controls the size at the nominal level, and that the exact interval estimation based on generalized confidence intervals provides not only sufficient coverage probability but also reasonable expected length. In general, the proposed methods outperform some published asymptotic maximum likelihood methods and nonparametric methods across the simulation scenarios considered. Finally, numerical examples using published datasets illustrate the proposed methods.
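To make the idea of a generalized confidence interval and a generalized p-value concrete, the following is a minimal Monte Carlo sketch for a simpler setting than the one treated in the dissertation: a single marker, independent diseased and healthy samples, and an assumed binormal model, under which AUROC = Φ((μ1 − μ0) / √(σ1² + σ0²)). It is not the paired procedure developed in this work, which must also account for the correlation between markers measured on the same subjects. The function name `gpq_binormal_auc`, its parameters, and the data are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def gpq_binormal_auc(diseased, healthy, level=0.95, draws=20000, seed=0):
    """Generalized CI and p-value for the binormal AUROC (independent samples).

    Assumption: each group is normally distributed. For a normal sample the
    standard generalized pivotal quantities are
        R_var  = (n - 1) * s^2 / U,           U ~ chi-square(n - 1)
        R_mean = xbar - Z * sqrt(R_var / n),  Z ~ N(0, 1)
    and they are plugged into the binormal AUROC formula.
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf
    # Observed summary statistics: (n, sample mean, sample variance).
    stats = [(len(x), mean(x), variance(x)) for x in (diseased, healthy)]

    def draw_mean_var(n, xbar, s2):
        u = rng.gammavariate((n - 1) / 2, 2.0)  # chi-square with n - 1 df
        r_var = (n - 1) * s2 / u
        r_mean = xbar - rng.gauss(0.0, 1.0) * (r_var / n) ** 0.5
        return r_mean, r_var

    sims = []
    for _ in range(draws):
        m1, v1 = draw_mean_var(*stats[0])
        m0, v0 = draw_mean_var(*stats[1])
        sims.append(phi((m1 - m0) / (v1 + v0) ** 0.5))
    sims.sort()

    alpha = 1 - level
    lower = sims[int(alpha / 2 * draws)]
    upper = sims[int((1 - alpha / 2) * draws) - 1]
    # Generalized p-value for H0: AUROC <= 0.5 against H1: AUROC > 0.5.
    gen_p = sum(a <= 0.5 for a in sims) / draws
    return lower, upper, gen_p


# Hypothetical marker values, for illustration only.
diseased = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5, 6.8, 6.0]
healthy = [4.0, 4.8, 5.2, 3.9, 4.5, 5.0, 4.2, 4.7]
lower, upper, gen_p = gpq_binormal_auc(diseased, healthy)
```

The percentiles of the simulated values give the generalized confidence limits, and the proportion of simulated values at or below 0.5 serves as the generalized p-value. Fixing `seed` keeps the Monte Carlo output reproducible.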

Parallel Keywords

generalized test variable, generalized pivotal quantity, generalized p-value, generalized confidence interval, multiple markers
