
Cox and Random Walk Statistical Models for Dynamics of Intractable Ordinal Data: An Example of Fecal Hemoglobin Concentration

Advisor: 陳秀熙 (Hsiu-Hsi Chen)

Abstract


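Background: Fecal hemoglobin concentration (f-Hb) has been shown to predict colorectal cancer incidence and mortality very well. In population screening, the repeated f-Hb values measured at each screen and their change over time therefore also matter for assessing population risk. Building a model of f-Hb change from population screening data is nevertheless very difficult, because the data are ordinal and involve correlation, censoring, and truncation. This study uses a random walk model with absorbing barriers to take these features into account and to describe the dynamics of f-Hb in the population.

Aims: The first aim of this thesis is to use survival analysis models to assess how f-Hb differs across screening groups (normal, colorectal adenoma, colorectal cancer), and to estimate the median f-Hb concentration (f-Hb50) at which colorectal adenoma and colorectal cancer occur in the population, together with the corresponding thresholds. The second main aim is to apply the random walk model to quantify the dynamic change in f-Hb concentration, taking into account the upper bound reached when colorectal adenoma or colorectal cancer occurs (that is, hitting the absorbing barrier).

Methods: We first used conventional one-way analysis of variance and survival analysis to test for differences in the mean or median f-Hb across the screening groups (normal, colorectal adenoma, colorectal cancer). We then applied the Cox proportional hazards regression model, controlling for potential covariates and accounting for correlation in the data, with the f-Hb values ranked ordinally, to estimate the hazard ratio for each group. Combined with a non-parametric ranking method, this allowed us to compute the median f-Hb (f-Hb50) in the three groups and to estimate the f-Hb thresholds for colorectal adenoma and colorectal cancer in the population.

For the dynamic stochastic model, we used the random walk model and developed the asymptotic distribution and multinomial distribution based on it to describe the progression of repeated f-Hb measurements, and estimated the probability that f-Hb rises (p) or falls (q) under each of the three disease states. The estimated probabilities were then used to compute the corresponding gambler's ruin probability (the probability of hitting the absorbing barrier) for each group (patients with colorectal cancer or colorectal adenoma).

Results: In the analysis of variance on natural-log-transformed f-Hb, the mean f-Hb differed significantly across the three groups (F = 104324, p < 0.001, R² = 0.142), and the non-parametric test likewise showed a significant difference (p < 0.001).

In the Cox proportional hazards model adjusted for other explanatory variables (sex, age, family history, and brand of screening test), with screenees free of disease as the reference group, the hazard ratio was 0.181 (0.178, 0.184) for the cancer group and 0.204 (0.202, 0.205) for the adenoma group. Because the ranked f-Hb value serves as the outcome scale in this model, hazard ratios below 1 correspond to groups that reach higher f-Hb values. These estimates therefore indicate that colorectal cancer cases and adenoma cases have higher f-Hb values: in a colorectal cancer screening program, people with a higher measured f-Hb have a higher risk of subsequently developing colorectal adenoma or colorectal cancer.

In the random walk model combined with logistic regression, the net upward probability of f-Hb (drift rate, p-q) was highest in cancer patients, followed by adenoma patients, and lowest in the screened population without colorectal disease. Taking as an example the random walk logistic regression that considers only the forward and backward probabilities, the model-estimated forward probability (p) and backward probability (q) were 0.733 and 0.267 in the cancer group, 0.575 and 0.425 in the adenoma group, and 0.358 and 0.642 in screenees not diagnosed with colorectal disease. The net upward probability is therefore positive in the cancer and adenoma groups and negative in normal subjects. Compared with the normal population, the odds of an increase in f-Hb were 4.92 times higher in the cancer group and 2.43 times higher in the adenoma group.

For the gambler's ruin calculation, the absorbing state was set at an f-Hb of 400 μg/g for cancer, 300 μg/g for adenoma, and 20 μg/g for the normal screened population. The resulting probability of reaching the absorbing state was 0.867 in the cancer group, higher than the 0.455 in the adenoma group, and lowest in the normal group, where it was almost 0. Assuming a starting concentration (x) of 1 for every subject, cancer patients are expected to take 740 steps on average to reach 400 μg/g, and the adenoma group 893 steps to reach 300 μg/g. For the normal population, the expected number of steps to reach the absorbing state at an f-Hb of 0 μg/g was 7.05.

Conclusions: This study applied the Cox proportional hazards model and built a random walk regression model to analyze ordinal data with extreme values and a right-skewed distribution. The model also accounts for values that cannot be measured because f-Hb is very low (left censoring) and for missing values, and it incorporates the multistage nature of the disease.

The models were applied to national colorectal cancer screening data to estimate the hazard ratios for high f-Hb in the colorectal cancer and adenoma groups relative to the normal population, and the group medians of f-Hb were used to define the f-Hb threshold for each group. Within the random walk framework, the study combines the estimated probabilities of f-Hb rising and falling in each group with the net upward probability and the number of steps needed to reach the absorbing state. This clarifies the ruin probability (the probability of reaching the absorbing state) as f-Hb rises or falls over time, and gives the expected number of steps to reach that state. The new indicators established by these results will help decision-making and surveillance planning for population-based colorectal cancer screening programs.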
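The forward and backward probabilities reported in the results can be cross-checked against the reported odds ratios. The following is a minimal Python sketch, assuming that the odds of moving forward in a group is p/q and that each odds ratio is taken against the normal group. The dictionary and function names are illustrative and do not come from the thesis.

```python
# Forward (p) and backward (q) probabilities as reported in the abstract.
GROUPS = {
    "CRC": (0.733, 0.267),
    "adenoma": (0.575, 0.425),
    "normal": (0.358, 0.642),
}


def drift(p: float, q: float) -> float:
    """Net upward probability (drift rate), p - q."""
    return p - q


def forward_odds(p: float, q: float) -> float:
    """Odds of a forward step, assumed here to be p / q."""
    return p / q


def odds_ratio(group: str, reference: str = "normal") -> float:
    """Odds ratio of moving forward for `group` relative to `reference`."""
    return forward_odds(*GROUPS[group]) / forward_odds(*GROUPS[reference])


if __name__ == "__main__":
    for name, (p, q) in GROUPS.items():
        print(f"{name}: drift = {drift(p, q):+.3f}")
    for name in ("CRC", "adenoma"):
        print(f"{name} vs normal: odds ratio = {odds_ratio(name):.3f}")
```

Under these assumptions the computed odds ratios are about 4.92 for CRC and 2.43 for adenoma, in line with the reported values, and the drift is positive for CRC and adenoma but negative for the normal group.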

Parallel Abstract


Background: Fecal hemoglobin concentration (f-Hb) is a good predictor of colorectal cancer (CRC) incidence and mortality, so its dynamics are of great interest given the large volume of population-based screening data from periodic f-Hb examinations. Modeling the evolution of f-Hb is difficult because the measurement is ordinal and the data involve correlation, censoring, truncation, and dynamic movement with absorbing barriers, which falls within the province of the random walk model.

Aims: The first aim of this thesis was to assess f-Hb values across three groups (normal, adenoma, and CRC) and to estimate the effective median f-Hb concentration (f-Hb50) and its threshold at the time adenoma and CRC were detected. The second aim was to apply the random walk model to quantify the dynamic change in f-Hb, taking into account the upper limit imposed by the occurrence of adenoma and CRC.

Methods: Conventional analysis of variance and survival analysis were used to test the difference in the mean (or median) value of f-Hb across the three groups. The Cox proportional hazards (PH) regression model, allowing for correlation in the data, was applied to estimate the hazard ratio (HR) of reaching a given f-Hb rank across the three groups after controlling for relevant covariates. A non-parametric method was used to estimate the effective median value of f-Hb (f-Hb50) and the threshold value of f-Hb at which colorectal adenoma and CRC occur. To capture the dynamic (stochastic) property, a random walk model with an asymptotic distribution and a multinomial distribution was further developed to describe the evolution of repeated f-Hb measurements and to estimate the forward probability (p) and backward probability (q) for the three disease states. These parameters were also used to calculate the gambler's ruin probabilities of hitting adenoma and CRC.

Results: ANOVA showed that the differences in mean f-Hb across the three groups were statistically significant. In the Cox PH regression adjusted for other covariates (gender, age, family history, and test brand), the HR relative to the normal group was 0.181 (0.178, 0.184) for the CRC group and 0.204 (0.202, 0.205) for the adenoma group, which suggests that screenees with higher f-Hb may have a higher probability of being diagnosed with disease. In the random walk logistic regression model, the drift rate (p-q) was highest in CRC patients, followed by adenoma patients, and lowest in subjects free of colorectal neoplasia. With the random walk logistic regression model considering only the forward (p) and backward (q) probabilities, the estimates of p and q were 0.733 and 0.267 for patients diagnosed with CRC, 0.575 and 0.425 for patients diagnosed with adenoma, and 0.358 and 0.642 for normal subjects. Compared with the normal group, the odds ratio of moving forward was 4.923 for CRC and 2.426 for adenoma. With the absorbing barrier set at 400 μg/g for CRC, 300 μg/g for adenoma, and 20 μg/g for normal subjects, the gambler's ruin probability of reaching the barrier was 0.867 for CRC, higher than the 0.455 for adenoma, whereas the ruin probability for normal subjects was very low. With the initial value (x) set to 1, it takes on average 740 steps for CRC, 893 steps for adenoma, and 7.05 steps for normal subjects to reach the absorbing barrier.

Conclusions: This thesis applied the Cox PH regression model and developed a random walk regression model to accommodate ordinal data with a long-tailed distribution at extremely high values, undetectable measurements at extremely low values, missing values, and a multi-state outcome. The proposed models were applied to nationwide population-based CRC screening with FIT to estimate the hazard ratios for CRC and adenoma relative to normal subjects, and to estimate the f-Hb50 and the thresholds for developing CRC and adenoma. They also give a better understanding of how f-Hb moves forward and backward over time, including the chance of gambler's ruin (reaching the f-Hb barriers) and the number of steps expected before ruin. These findings provide new insight for policy-making in colorectal cancer screening and for the surveillance of early-detected colorectal cancer.
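The gambler's ruin quantities mentioned above have closed forms in the simplest textbook setting. The following Python sketch assumes a simple random walk with unit steps, p + q = 1, and absorbing barriers at 0 and n. The barrier value, starting point, and function names are hypothetical choices for illustration. This is not the thesis's fitted model, and it is not expected to reproduce the ruin probabilities or step counts reported in the abstract.

```python
def hit_upper_prob(x: int, n: int, p: float) -> float:
    """Probability that a simple random walk started at x reaches n before 0."""
    q = 1.0 - p
    if p == q:  # symmetric walk
        return x / n
    r = q / p
    return (1.0 - r ** x) / (1.0 - r ** n)


def expected_duration(x: int, n: int, p: float) -> float:
    """Expected number of steps until absorption at 0 or n, starting from x."""
    q = 1.0 - p
    if p == q:  # symmetric walk
        return float(x * (n - x))
    return x / (q - p) - (n / (q - p)) * hit_upper_prob(x, n, p)


if __name__ == "__main__":
    n_barrier = 10  # hypothetical upper barrier, in steps (not in ug/g)
    start = 1       # hypothetical starting position
    # Forward probabilities taken from the abstract; q is assumed to be 1 - p.
    for label, p in [("CRC", 0.733), ("adenoma", 0.575), ("normal", 0.358)]:
        prob = hit_upper_prob(start, n_barrier, p)
        steps = expected_duration(start, n_barrier, p)
        print(f"{label}: P(reach upper barrier) = {prob:.3f}, E[steps] = {steps:.2f}")
```

A larger forward probability raises the chance of reaching the upper barrier first, which is the qualitative ordering the abstract reports for CRC, adenoma, and normal subjects.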

