  • Degree thesis


Methodology for the Time-Dependent AUC and Its Applications

Advisor: 江金倉


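Receiver operating characteristic (ROC) curve analysis has been widely applied to the evaluation of early disease diagnosis. The area under the ROC curve (AUC) and the partial area under the ROC curve (PAUC) are the two most commonly used summary measures. In recent applications, data with a time-varying disease status have become increasingly common. Unlike the traditional binary classification (diseased or not diseased), a time-dependent disease status is defined by the occurrence time of a specific event. At any given time point, subjects are divided into two groups: those who have developed the disease before that time and those who have not. Traditional ROC curve analysis must therefore be extended to time-dependent ROC curve analysis. Because the event time may be censored, so that the exact time of occurrence is unknown, the greatest challenge in time-dependent ROC curve analysis is how to carry out statistical inference with incomplete data. Another common problem is that a single biomarker often fails to classify disease status at the desired level. In many situations, multiple biomarkers are observed for each subject at the same time, and how to combine them to improve classification ability is also an important issue. In contrast to existing methods, I propose a series of nonparametric estimators for the time-dependent AUC and PAUC. Because these estimators have explicit expressions, they contribute substantially to both computational efficiency and the derivation of large-sample theory. In addition, I propose a generalized linear model for analyzing the relationship between the time-dependent AUC and other covariates. Finally, assuming that the conditional survival distribution function satisfies an extended generalized linear model (EGLM), without specifying the exact form of the link function, I propose two nonparametric estimation methods for the optimal composite marker. All of the inference procedures developed are applied to two clinical data sets, from the AIDS Clinical Trials Group (ACTG) 175 study and the Angiography Coronary Artery Disease (CAD) study, to demonstrate their practical usefulness.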
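The case and control groups described above can be written out as a worked definition. The notation here is an assumption for illustration and is not taken from the thesis: $Y$ is a continuous marker for which larger values indicate higher risk, $T$ is the event time, and $q \in (0, 1]$ is an upper bound on the false positive rate.

```latex
% Cases at time t: {T <= t}; controls at time t: {T > t}.
\begin{align*}
\mathrm{TPR}_t(c) &= P(Y > c \mid T \le t), &
\mathrm{FPR}_t(c) &= P(Y > c \mid T > t), \\
\mathrm{ROC}_t(u) &= \mathrm{TPR}_t\!\bigl(\mathrm{FPR}_t^{-1}(u)\bigr), &
u &\in [0, 1], \\
\mathrm{AUC}(t) &= \int_0^1 \mathrm{ROC}_t(u)\,du
  = P(Y_i > Y_j \mid T_i \le t,\ T_j > t), \\
\mathrm{PAUC}_q(t) &= \int_0^q \mathrm{ROC}_t(u)\,du .
\end{align*}
```

Here $(Y_i, T_i)$ and $(Y_j, T_j)$ denote independent copies of $(Y, T)$, so $\mathrm{AUC}(t)$ is the probability that a randomly chosen case at time $t$ has a larger marker value than a randomly chosen control.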


AUC, classification, disease status, optimal composite marker, prediction, ROC, survival time


Receiver operating characteristic (ROC) curves are widely used to evaluate the performance of test results in the early detection of disease. The area under the ROC curve (AUC) and the partial area under the ROC curve (PAUC) are the most popular summary measures because of their generality and ease of probabilistic interpretation. In applications, data with a binary time-varying disease status are frequently encountered, and the cases and controls in the ROC analysis are more suitably defined over time. A major challenge in dealing with this issue is that the failure status of some individuals might not be available because of censoring. To further increase the classification ability of multiple biomarkers, research interest usually focuses on seeking the combination of these biomarkers with the highest ROC curve. In contrast to the existing methods, we propose nonparametric estimators for the time-dependent AUC and PAUC with explicit expressions, together with a rigorous theoretical development for these methods. Moreover, we use a generalized linear model with time-varying coefficients to characterize the time-dependent AUC as a function of covariate values. Estimation and inference procedures are also proposed for the parameter functions and the related classification accuracies. Assuming that an extended generalized linear model (EGLM) with time-varying coefficients and an unknown link function holds for the conditional survival distribution, two nonparametric procedures are proposed to estimate the optimal composite markers based on the estimation procedures for the time-dependent AUC. Two empirical examples, from the AIDS Clinical Trials Group (ACTG) 175 study and the Angiography Coronary Artery Disease (CAD) study, are used to illustrate the usefulness of our methods. Finally, some concluding remarks and topics for further research are presented in this thesis.
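To make the idea of cases and controls defined over time concrete, the following is a minimal sketch of an empirical time-dependent AUC at a single time point. It is not the estimator proposed in the thesis: it assumes every event time is fully observed (no censoring, which is exactly the difficulty the thesis addresses) and that larger marker values indicate higher risk. The function name `empirical_auc_t` and its arguments are hypothetical.

```python
import numpy as np


def empirical_auc_t(marker, event_time, t):
    """Empirical AUC at time t for fully observed (uncensored) event times.

    Cases are subjects with event_time <= t, controls are subjects with
    event_time > t. Returns the proportion of case-control pairs in which
    the case has the larger marker value, counting ties as one half.
    """
    marker = np.asarray(marker, dtype=float)
    event_time = np.asarray(event_time, dtype=float)

    cases = marker[event_time <= t]
    controls = marker[event_time > t]
    if cases.size == 0 or controls.size == 0:
        raise ValueError("need at least one case and one control at time t")

    # Pairwise differences: rows index cases, columns index controls.
    diff = cases[:, None] - controls[None, :]
    concordant = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return concordant / (cases.size * controls.size)


# Toy data (made up for illustration): four subjects, evaluated at t = 3.
toy_marker = [0.9, 0.8, 0.3, 0.1]
toy_time = [1.0, 2.0, 5.0, 6.0]
print(empirical_auc_t(toy_marker, toy_time, t=3.0))
```

With censored data, the case or control status of some subjects at time `t` is unknown, so this simple pairwise proportion no longer applies directly and a censoring-aware estimator is needed.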


