
統合隨機過程於序位型生物標記資料結合多階段之應用

Meta-stochastic Process for Ordinal Biomarker with Multi-state Outcome

Advisor: 陳秀熙

Abstract


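Background: In recent years, the use of ordinal biomarkers to screen for early cancer lesions has received growing attention, and the statistical properties that affect their use deserve clarification. These properties include skewed distributions, censoring due to incomplete data, correlation, dynamic change, and association with the course of carcinogenesis, all of which make the analysis of such data considerably complex. They also motivated this study to develop links between different types of stochastic processes, which we call a meta-stochastic process. The fecal hemoglobin concentration (f-Hb) measured by fecal occult blood testing shows a dose-response trend with colorectal neoplasia. Population-based colorectal cancer screening that uses f-Hb as the screening tool is therefore one example of the application of such an ordinal biomarker.

Aims: To clarify how the percentile rank of f-Hb and its dynamic change affect the progression of colorectal cancer from normal to the preclinical detectable phase (PCDP) and then to the clinical phase, this study uses a Bayesian meta-stochastic process framework that links continuous-state models (random walk model and diffusion process) with a discrete-state Markov regression model.

Methods: This study first used a quantile-based statistical method to assess the risk of colorectal neoplasia at each f-Hb quantile. On this basis, a Bayesian generalized random-walk regression model and a diffusion model were used to describe the course of change in repeated f-Hb measurements. The model was applied to screening data to estimate the probability that f-Hb rises (p) or falls (q) in three disease states (normal, adenoma, and colorectal cancer). The estimated probabilities were then used with the Gambler's ruin probability theorem to calculate the corresponding dynamic drift for each group (patients with colorectal cancer or colorectal adenoma), that is, the probability of and the time to reaching the absorbing barrier. The effect of f-Hb on the different stages of colorectal cancer progression, namely the onset of the preclinical lesion (initiator) and its development into a clinical lesion (promoter), was assessed with a three-state Markov model and a proportional hazards regression model. To link the two types of stochastic process into a meta-stochastic model, this study developed a Bayesian directed acyclic graph (DAG) model. It links the rise (p) and fall (q) probabilities underlying the drift of f-Hb to the disease stages in order to assess their effect on the incidence rate of colorectal cancer and on the transition rate from the asymptomatic to the symptomatic phase. The rise and fall probabilities follow binomial distributions in the random walk model and diffusion process, the transition parameters of the three-state Markov model follow gamma distributions, and all parameters were estimated by Bayesian Markov chain Monte Carlo (MCMC).

Application: The meta-stochastic process was applied to data from Taiwan's biennial population-based colorectal cancer screening program, which uses fecal occult blood testing. The data fall into two periods, 2004-2009 and 2010-2014. From a Bayesian viewpoint, the earlier 2004-2009 data serve as prior information and the later 2010-2014 data as the likelihood; the parameters were estimated, and their effects assessed, from the resulting posterior distributions.

Results: This study first assessed the effect of f-Hb quantile for each type of colorectal lesion. Simulation showed that the logistic function was the most suitable link function for the generalized random-walk regression model. The estimated rise (p) and fall (q) probabilities of f-Hb by disease state were: colorectal cancer, p = 0.867 (95% CI: 0.853-0.880) and q = 0.134 (95% CI: 0.120-0.148); non-advanced adenoma, p = 0.732 (95% CI: 0.715-0.750) and q = 0.268 (95% CI: 0.250-0.285); advanced adenoma, p = 0.797 (95% CI: 0.768-0.824) and q = 0.203 (95% CI: 0.176-0.232); normal subjects, p = 0.297 (95% CI: 0.296-0.298) and q = 0.703 (95% CI: 0.702-0.704). Based on these estimates, the mean time (number of random-walk steps) for a normal subject to return to 0 was 2.46 (95% CI: 2.45-2.47), whereas the mean number of steps needed to reach non-advanced adenoma was 204 (95% CI: 199-209), advanced adenoma 251 (95% CI: 242-260), and colorectal cancer 462 (95% CI: 454-469). Estimates obtained from the 2004-2009 data were lower than those obtained from the 2010-2014 data. The three-state Markov regression model showed that baseline f-Hb, rather than the most recent f-Hb concentration, had a significant and dose-dependent effect on the occurrence of colorectal neoplasia, but little effect on progression from PCDP to the clinical phase. The Bayesian meta-stochastic model showed that both baseline f-Hb and its drift had a significant effect on the onset of PCDP of colorectal cancer, but no such effect on progression from PCDP to the clinical phase.

Conclusions: This study proposes a novel meta-stochastic process model that links the dynamics of f-Hb, described by a random walk model and a diffusion model, to multi-stage disease progression, described by a continuous-time Markov process. The model was applied to population-based colorectal cancer screening with fecal occult blood testing to quantify the effect of f-Hb concentration on the early detection of colorectal cancer.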
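
The Gambler's ruin step described in the Methods can be illustrated with a minimal Python sketch. It uses the textbook formulas for a simple random walk on 0..barrier with a fixed upward probability p and downward probability 1 - p. The function name `gamblers_ruin`, the starting position, and the barrier value are hypothetical choices for illustration; the abstract does not state the thesis's barrier settings, and the thesis itself fits a Bayesian regression model rather than plugging in fixed values. The p values are those reported in the Results, and q is taken as 1 - p.

```python
"""Gambler's ruin quantities for a simple random walk (illustrative sketch)."""


def gamblers_ruin(p, start, barrier):
    """Random walk on 0..barrier: up with probability p, down with 1 - p.

    Returns a tuple:
      (probability of reaching `barrier` before 0,
       expected number of steps until either boundary is hit).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if barrier < 1 or not 0 <= start <= barrier:
        raise ValueError("need barrier >= 1 and 0 <= start <= barrier")
    q = 1.0 - p
    if abs(p - q) < 1e-12:
        # Symmetric walk: the general formulas degenerate, use the limits.
        return start / barrier, float(start * (barrier - start))
    ratio = q / p
    reach = (1.0 - ratio**start) / (1.0 - ratio**barrier)
    steps = (start - barrier * reach) / (q - p)
    return reach, steps


# Forward probabilities p reported in the abstract; q is taken as 1 - p here.
P_UP = {
    "normal": 0.297,
    "non-advanced adenoma": 0.732,
    "advanced adenoma": 0.797,
    "CRC": 0.867,
}

if __name__ == "__main__":
    for group, p in P_UP.items():
        # start=5 and barrier=10 are hypothetical, not values from the thesis.
        reach, steps = gamblers_ruin(p, start=5, barrier=10)
        print(f"{group}: drift={2 * p - 1:+.3f} reach={reach:.4f} steps={steps:.1f}")
```

With these assumptions the drift p - q is negative only for normal subjects, so their walk tends to be absorbed at 0, while the adenoma and CRC groups drift toward the upper barrier.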

Parallel Abstract


Introduction: As ordinal biomarkers have gained popularity as screening tools for the early detection of cancer, their statistical characteristics have become a major concern. These include skewness, censoring, correlation, dynamics, and association with the state of carcinogenesis, all of which make statistical analysis complex and often intractable. These characteristics also motivate links between different types of stochastic process, referred to hereafter as a meta-stochastic process. One application is population-based screening for colorectal cancer (CRC) with the fecal immunochemical test (FIT), in which fecal hemoglobin (f-Hb) concentration shows a dose-response relationship with colorectal neoplasia.

Aims: To study the percentile and the dynamics of f-Hb and their effects on the preclinical detectable phase (PCDP) and the clinical phase (CP), using a Bayesian quantile-based method and a meta-stochastic model that combines continuous-state processes (random walk model and diffusion process) with a discrete-state Markov regression.

Methods: A Bayesian quantile-based method was used to estimate the risk of colorectal neoplasia by percentile of f-Hb. A Bayesian generalized random-walk and diffusion regression model was then developed. It estimates the forward probability (p) and backward probability (q) of f-Hb change through an appropriate link function, for three disease outcomes (normal, adenoma, and CRC), from empirical screening data. These estimates were used to calculate the probability of, and the time required for, reaching an absorbing barrier (the threshold of colorectal neoplasia) following the Gambler's ruin probability theorem. The effect of f-Hb on the initiator responsible for the onset of PCDP and on the promoter responsible for the transition from PCDP to CP was modelled with a three-state Markov regression model. To link the two types of stochastic process into a meta-stochastic process, a Bayesian directed acyclic graph (DAG) model was developed. It quantifies the respective effects of the drift (p - q) on the incidence rate of PCDP and on the transition rate from PCDP to CP by linking p and q, each following a binomial distribution in the random walk model and diffusion process, with the two transition parameters of the three-state Markov process. Parameters were estimated with a Bayesian Markov chain Monte Carlo (MCMC) procedure.

Application: The meta-stochastic process was applied to data from the nationwide biennial FIT-based colorectal cancer screening program, divided into two periods, 2004-2009 and 2010-2014. From a Bayesian viewpoint, the parameters estimated from 2004-2009 were treated as prior distributions and the data from 2010-2014 as the likelihood. Together they yielded the posterior distributions used for parameter estimation.

Results: The risk of colorectal neoplasia stratified by the percentile of f-Hb was estimated. After simulation identified the logistic link as an appropriate link function for the Bayesian generalized random-walk model, the forward (p) and backward (q) probabilities were 0.867 (95% CI: 0.853-0.880) and 0.134 (95% CI: 0.120-0.148) for CRC, 0.797 (95% CI: 0.768-0.824) and 0.203 (95% CI: 0.176-0.232) for advanced adenoma, 0.732 (95% CI: 0.715-0.750) and 0.268 (95% CI: 0.250-0.285) for non-advanced adenoma, and 0.297 (95% CI: 0.296-0.298) and 0.703 (95% CI: 0.702-0.704) for normal subjects. The mean time (number of steps) required was 2.46 (95% CI: 2.45-2.47) for the normal state, 204 (95% CI: 199-209) for non-advanced adenoma, 251 (95% CI: 242-260) for advanced adenoma, and 462 (95% CI: 454-469) for CRC. The three-state Markov regression model identified baseline f-Hb, rather than updated f-Hb, as a significant initiator with a dose-response relationship, but found no significant effect on the transition from PCDP to CP. The Bayesian meta-stochastic model showed that both baseline f-Hb and the drift of f-Hb contributed significantly to an increased incidence of PCDP but not to the transition from PCDP to CP.

Conclusions: This thesis proposes a novel meta-stochastic model that links a random-walk model or diffusion process, which handles the dynamics of ordinal biomarker data, with a continuous-time Markov process, which handles multi-state disease progression. The proposed model was successfully applied to f-Hb measured by FIT for the early detection of CRC in population-based organized service screening.
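
The three-state Markov regression mentioned in the abstract can be sketched in Python as follows. This is a minimal illustration, assuming a progressive continuous-time model (normal to PCDP to CP, with CP absorbing) with constant rates and a log-linear covariate effect on a rate. The names `three_state_probs` and `rate_with_covariate`, and all numeric values, are hypothetical; the abstract does not report the fitted rates or regression coefficients, and this sketch does not perform the Bayesian MCMC estimation used in the thesis.

```python
import math


def three_state_probs(lam1, lam2, t):
    """Transition probability matrix over time t for normal -> PCDP -> CP.

    lam1: rate from normal to PCDP (incidence of preclinical disease).
    lam2: rate from PCDP to CP (CP is an absorbing state).
    Rows and columns are ordered normal, PCDP, CP.
    """
    if lam1 <= 0 or lam2 <= 0 or t < 0:
        raise ValueError("rates must be positive and t non-negative")
    e1 = math.exp(-lam1 * t)
    e2 = math.exp(-lam2 * t)
    if math.isclose(lam1, lam2):
        # Limit of the general expression when the two rates coincide.
        p12 = lam1 * t * e1
    else:
        p12 = lam1 / (lam2 - lam1) * (e1 - e2)
    return [
        [e1, p12, 1.0 - e1 - p12],
        [0.0, e2, 1.0 - e2],
        [0.0, 0.0, 1.0],
    ]


def rate_with_covariate(base_rate, beta, x):
    """Proportional-hazards form: rate = base_rate * exp(beta * x)."""
    return base_rate * math.exp(beta * x)


if __name__ == "__main__":
    # Hypothetical values: baseline incidence rate, coefficient for an
    # f-Hb category x, and PCDP -> CP rate (all per year).
    for x in (0, 1, 2):
        lam1 = rate_with_covariate(0.002, beta=0.5, x=x)
        row = three_state_probs(lam1, lam2=0.3, t=2.0)[0]
        print(x, [round(v, 5) for v in row])
```

In this form, a covariate such as baseline f-Hb can act on the normal-to-PCDP rate (the initiator) without changing the PCDP-to-CP rate (the promoter), which mirrors the pattern described in the Results.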

