  • Degree thesis

以機器學習模型發展及驗證短版中風衝擊量表

Using Machine Learning Algorithms to Develop and Validate a Brief Version of Stroke Impact Scale

Advisor: 謝清麟

Abstract


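Background and purpose: The Stroke Impact Scale 3.0 (SIS) is a well-known measure of quality of life. However, the SIS has many items (59 in total), so it takes a long time to administer and is poorly suited to routine clinical assessment. Several short forms of the SIS already shorten administration time, but each yields only a single score representing the patient's overall quality of life. No version provides domain scores, which makes it hard to understand how a patient is doing in each area of life. Previous studies have found that machine learning can predict more accurately than traditional linear regression models, so it has strong potential for developing short-form scales. This study had two aims: (1) to develop a short form of the SIS using machine learning models (Stroke Impact Scale-Brief Machine learning version, SIS-ML); and (2) to examine its concurrent validity and convergent validity.

Methods: This study ran simulation analyses on SIS data from a previously published study, in two phases: (1) developing the short form of the SIS; and (2) examining its concurrent validity and convergent validity. Developing the SIS-ML involved 5 steps. (1) Selecting the item sets for the short form: an artificial neural network (ANN) model incorporating lasso regression was used to select items, forming 17 candidate SIS-ML item sets with totals of 16 to 32 items. Each item set had to cover all 8 domains, with 2 to 4 items per domain. (2) Training ANN models to predict the domain scores: to find the model with the best predictive power, the researchers trained 136 machine learning models, formed from 17 item counts (the 16-item to 32-item versions), 2 numbers of hidden layers (6 and 10), and 4 numbers of neurons (8, 32, 196, and 512 per layer). The data were placed in random order and divided by number of participants, in proportions of 35%, 35%, and 30%, into training, validation, and test data for model training, validation, and testing. (3) Using the training data to select the ANN architectures with better predictive power: architectures whose coefficient of determination exceeded 0.80 in every domain were selected. (4) Using the validation data to select the item counts with better predictive power: from the results of the previous step, the item counts (that is, the total number of SIS-ML items) whose coefficient of determination exceeded 0.80 for every domain score were selected. (5) Using the test data to select the most suitable machine learning model: from the results of step 4, the model with the best overall predictive power was selected as the basis for computing SIS-ML scores. The figures for steps 3 and 4 came from the validation data, and those for step 5 came from the test data. If several models for one item set met the criterion, the scores of those models were averaged in order to select the most suitable item set and model. In the second phase, the researchers fed the test data into the model selected in the first phase to simulate the original SIS scores, and examined how strongly the simulated scores correlated with the original SIS scores, the National Institutes of Health Stroke Scale (NIHSS), and the Barthel Index (BI).

Results: This study used data from 256 patients with stroke enrolled in a previous study, divided into training data (89 patients), validation data (90 patients), and test data (77 patients). In the first phase, models derived from 3 architectures (6X196, 6X512, and 10X196; hidden layers X neurons per layer) had better predictive power (step 3). Among the models derived from these 3 architectures, 6 item counts (27 to 32 items) predicted better (step 4). Among these, the 27-item set paired with the 6X196 architecture had the highest mean coefficient of determination, so it was selected as the most suitable pairing of item set and architecture. The second phase examined the concurrent validity and convergent validity of the SIS-ML. For concurrent validity, the simulated SIS-ML scores were highly correlated with the original SIS (r = 0.92 to 0.99), indicating good concurrent validity between the SIS-ML and the original SIS. For convergent validity, the simulated SIS-ML domain scores showed low to moderate correlations with the NIHSS (r = -0.34 to -0.59) and low to high correlations with the BI (r = 0.22 to 0.74).

Conclusion: The results support that the SIS-ML, developed with machine learning models, can reduce the number of SIS items to 45% of the original. The first-phase results indicate that the most suitable short form has 27 items. The second phase also supports that the simulated scores of the 27-item SIS-ML have good concurrent validity with the original SIS, and that their convergent validity with the NIHSS and BI matches that of the original. However, the sample used in this study was small, and the SIS-ML has not yet been validated clinically. Future studies therefore need a larger sample to confirm the predictive power of the current version and model, and then an independent sample to examine the reliability and validity of the SIS-ML.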
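The model grid and data split described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis code: the function names `model_grid` and `split_indices` and the random seed are hypothetical, and the split simply uses the group sizes reported in the Results (89, 90, and 77 of 256) rather than guessing how the 35%/35%/30% proportions were rounded.

```python
import itertools
import random

# Grid reported in the abstract: 17 item counts x 2 depths x 4 widths.
ITEM_COUNTS = range(16, 33)            # 16 to 32 items, inclusive
HIDDEN_LAYERS = (6, 10)                # number of hidden layers
NEURONS_PER_LAYER = (8, 32, 196, 512)  # neurons in each hidden layer


def model_grid():
    """Return every (n_items, n_layers, n_neurons) combination."""
    return list(itertools.product(ITEM_COUNTS, HIDDEN_LAYERS, NEURONS_PER_LAYER))


def split_indices(n_total, sizes, seed=0):
    """Shuffle participant indices and cut them into consecutive groups.

    `sizes` are the group sizes to use; the seed is an arbitrary
    assumption so that the shuffle is repeatable.
    """
    if sum(sizes) != n_total:
        raise ValueError("group sizes must add up to the sample size")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    groups, start = [], 0
    for size in sizes:
        groups.append(indices[start:start + size])
        start += size
    return groups


grid = model_grid()
train, valid, test = split_indices(256, (89, 90, 77))
print(len(grid))                          # 136
print(len(train), len(valid), len(test))  # 89 90 77
```

Enumerating the grid this way confirms that 17 x 2 x 4 gives the 136 models mentioned in the abstract, and that the selected pairing (27 items, 6 layers, 196 neurons) is one point in that grid.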

Parallel Abstract


Background and purposes: The Stroke Impact Scale (SIS) is a commonly used measure of health-related quality of life. However, the SIS has many items, so it takes a long time to administer and places a burden on patients. Although previous studies have developed brief versions, those versions cannot provide scores for the original domains. Previous research has shown that machine learning can improve prediction accuracy, so machine learning algorithms may assist in developing a brief version of the SIS. The objectives of our study were (1) to use machine learning algorithms to develop a brief version of the SIS (SIS-ML) and (2) to examine its concurrent validity and convergent validity in patients with stroke.

Methods: Our study used data collected in a previous study for simulation analysis and comprised 2 phases: (1) development of the SIS-ML; and (2) validation of its concurrent validity and convergent validity. Phase 1 contained 5 steps. (1) Choosing the item sets of the SIS-ML: two to four items in each domain were selected using an artificial neural network (ANN) model with lasso regression. (2) Training the ANN models to optimize their power to predict the domain scores: a total of 136 models were trained, formed from 17 item sets (16 to 32 items), 2 numbers of hidden layers (6 and 10), and 4 numbers of neurons per layer (8, 32, 196, and 512). The whole sample was randomly divided into a training set, a validating set, and a testing set in a ratio of 35%, 35%, and 30%, respectively. (3) Choosing the model frameworks with better predictive power by the training set: the models that achieved coefficients of determination (R²) exceeding 0.80 were retained. (4) Choosing the models with better predictive power by the validating set: the models that were retained in the previous step and achieved R² > 0.80 in each domain were retained. (5) Choosing the best model, with high predictive power and efficiency, by the testing set: the model that used the fewest items to achieve the highest average R² was selected. R² was calculated on the validating set for steps 3 and 4 and on the testing set for step 5. In Phase 2, Pearson's correlation coefficient (r) was used to examine the concurrent validity between the SIS-ML chosen in Phase 1 and the original SIS, and the convergent validity between the SIS-ML and both the National Institutes of Health Stroke Scale (NIHSS) and the Barthel Index (BI).

Results: In Phase 1, 17 item sets were chosen, resulting in 136 groups of model scores (steps 1 and 2). Three model frameworks (6X196, 6X512, and 10X196) were chosen as having better predictive power (step 3). Six item sets of the SIS-ML, containing 27 to 32 items, had better predictive power and were considered acceptable (step 4). Finally, the best number of items was 27 and the best model framework was 6X196 (step 5). In Phase 2, the SIS-ML chosen in Phase 1 was highly correlated with the SIS (r = 0.92 to 0.99). The SIS-ML had fair to moderate correlations with the NIHSS (r = -0.34 to -0.59) and poor to high correlations with the BI (r = 0.22 to 0.74).

Conclusion: The SIS-ML contains fewer than half of the items of the original SIS. The 27-item set with the 6X196 model framework was the best version of the SIS-ML. The SIS-ML had good concurrent validity with the SIS, and its convergent validity was similar to that of the SIS.
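The two statistics the abstract relies on, the coefficient of determination and Pearson's r, and the "R² > 0.80 in every domain" screening rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `select_model` encodes only one plausible reading of steps 3 to 5 (screen on every domain, then take the highest mean R², breaking ties toward fewer items), and the R² values in `candidates` are made-up placeholders, not results from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the observed scores are not all identical (SS_tot > 0).
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_model(domain_r2_by_model, threshold=0.80):
    """Pick a model key (n_items, n_layers, n_neurons), or None.

    Keeps models whose R2 exceeds the threshold in every domain, then
    returns the one with the highest mean R2; ties go to fewer items.
    """
    eligible = {
        model: scores
        for model, scores in domain_r2_by_model.items()
        if min(scores) > threshold
    }
    if not eligible:
        return None
    return max(
        eligible,
        key=lambda m: (sum(eligible[m]) / len(eligible[m]), -m[0]),
    )


# Placeholder R2 values for 8 domains; NOT figures from the thesis.
candidates = {
    (27, 6, 196): [0.90] * 8,
    (28, 6, 512): [0.85] * 8,
    (20, 10, 8): [0.95] * 7 + [0.60],  # fails in one domain
}
best = select_model(candidates)
```

With these placeholder values, the third candidate is screened out because one domain falls below 0.80, even though its other domains score highest, which is the point of requiring the threshold in each domain rather than on the average.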

