Statistical Inference for Incomplete Time-to-Event Data with an Event-Free Probability

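Survival analysis is typically concerned with time-to-event data, such as time to death, time to disease relapse, or age at onset. In general, time-to-event data cannot be observed completely: different study designs and data sampling schemes can give rise to censored data or truncated data. In a longitudinal follow-up study, interval censored data are often collected, and in a longitudinal follow-up study of a healthy cohort, left truncated and interval censored data are frequently encountered. A basic assumption of traditional survival analysis is that, for the disease of interest, every study subject will eventually develop the disease and will relapse after treatment (Cox and Oakes, 1984; Kalbfleisch and Prentice, 2002). In practice, however, owing to various genetic and environmental factors, some study subjects may never develop the disease of interest. In addition, thanks to advances in medical diagnostic technology and strategy, many patients who previously could not be adequately treated can now be properly diagnosed and cured. Statistical methods for event history analysis have therefore come to incorporate an event-free fraction, such as a probability of nonsusceptibility or of cure (Miller, 1981). Recently, parametric and semiparametric regression models based on a mixture survival distribution have been widely applied to right censored data (Farewell, 1982, 1986; Kuk and Chen, 1992; Yamaguchi, 1992; Peng et al., 1998; Peng and Dear, 2000; Sy and Taylor, 2000; Li and Taylor, 2002; Lu and Ying, 2004). For left truncated and interval censored data with an event-free fraction, Chen et al. (2013) proposed a logistic-AFT location-scale mixture regression model with nonsusceptibility. To the best of our knowledge, however, no nonparametric estimation method in the literature considers both the event-free fraction and the event time distribution for left truncated and interval censored data, and very few two-sample nonparametric rank test statistics have been proposed for right censored data with an event-free fraction. Considering the event-free fraction and the event time distribution simultaneously, we therefore propose estimation and testing methods for these two types of data: (1) in Chapter 2, a one-sample nonparametric estimation for left truncated and interval censored data, and (2) in Chapter 3, a two-sample nonparametric rank test for right censored data. Moreover, in biomedical research, regression models are commonly used to assess the effects of covariates. To make the method of Chen et al. (2013) convenient to use for data analysis, in Chapter 4 we develop, on the basis of that method, a statistical software system with a web-based user-friendly interface, named "EHA-RiskFree".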

Keywords

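Cure rate; EM algorithm; Interval censoring; Left truncation; Nonsusceptibility; Rank test; Self-consistency estimator; Generalized Wilcoxon test; Web-based user-friendly interface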

Parallel Abstract

Survival analysis is concerned with time-to-event data, such as time to death, time to relapse of a disease, and age at onset of a disorder. Typically, a set of time-to-event data cannot be completely observed. Various study designs and data sampling schemes may produce censored and/or truncated data. In a longitudinal follow-up study, general interval censored data are often collected. Moreover, in a longitudinal follow-up study of a healthy cohort, left truncated and interval censored (LTIC) data are frequently encountered. In traditional survival analysis, an underlying assumption is that all the study subjects are susceptible to contracting or relapsing into the disease of interest (Cox and Oakes, 1984; Kalbfleisch and Prentice, 2002). However, owing to various genetic and environmental etiologies, some study subjects may not be susceptible to the disease of interest. Moreover, due to recent progress in medical diagnostic technology and strategy, many patients who could not previously be adequately treated can now be appropriately diagnosed and cured. Hence, statistical methods in event history analysis have considered incorporating event-free fractions such as probabilities of nonsusceptibility or cure (Miller, 1981). Recently, parametric and semiparametric regression models with the mixture survival distribution have been extensively studied for right censored data (Farewell, 1982, 1986; Kuk and Chen, 1992; Yamaguchi, 1992; Peng et al., 1998; Peng and Dear, 2000; Sy and Taylor, 2000; Li and Taylor, 2002; Lu and Ying, 2004). For LTIC data with an event-free fraction, Chen et al. (2013) recently proposed logistic-AFT location-scale mixture regression models with nonsusceptibility.
To the best of our knowledge, however, no nonparametric estimation method that considers both the event-free fraction and the event time distribution simultaneously has been discussed in the literature for LTIC data, and very few two-sample rank test statistics have been proposed for right censored data with an event-free fraction. Therefore, incorporating the event-free fraction(s) with the event time distribution(s) simultaneously, we develop (i) a one-sample nonparametric estimation for LTIC data in Chapter 2 and (ii) two-sample rank tests for right censored data in Chapter 3. In addition, covariate effects are important in biomedical studies and are usually assessed by regression models. Therefore, to facilitate the analysis procedures, in Chapter 4 we develop the statistical software system “EHA-RiskFree”, with a web-based user-friendly interface, on the methodological foundation of Chen et al. (2013).
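
As a minimal sketch of the mixture formulation with an event-free fraction mentioned in the abstract, the population survival function can be written as follows. The notation is assumed here for illustration and is not taken from the thesis: \pi denotes the event-free fraction, T the event time, and S_u the survival function of susceptible subjects, with S_u(t) tending to 0 as t grows.

```latex
% Assumed notation (not from the thesis):
%   \pi    = event-free fraction (e.g. probability of nonsusceptibility or cure)
%   S_u(t) = survival function of the susceptible subjects, S_u(t) -> 0 as t -> infinity
\[
  S(t) \;=\; \Pr(T > t) \;=\; \pi + (1 - \pi)\, S_u(t),
  \qquad
  \lim_{t \to \infty} S(t) \;=\; \pi .
\]
```

Under this form the population survival curve levels off at \pi instead of falling to zero, which is one way to see why the event-free fraction and the event time distribution need to be considered together.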

Parallel Keywords

Cure fraction; EM algorithm; Interval censoring; Left truncation; Nonsusceptibility; Rank test; Self-consistency estimator; Generalized Wilcoxon tests; Web-based user-friendly interface

References

Boag, J. W. (1949). Maximum likelihood estimates of the proportion of patients cured by cancer

Breslow, N. (1970). A generalized Kruskal-Wallis test for comparing K samples subject to unequal

regression models with nonsusceptibility for left-truncated and general interval-censored data. Statistics in Medicine, Early View, DOI: 10.1002/sim.5845.

Cox, D. R. (1972). Regression models and life tables (with discussion). Journal of the Royal
