
Early Diagnosis Prediction from COVID-19 Symptoms Using Machine Learning Methods

Advisor: 鄭志宏
Co-advisor: 謝哲光 (Jer-Guang Hsieh)

Abstract


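Developing predictive models for early disease diagnosis with machine learning algorithms is a useful application of artificial intelligence, as shown by the many research efforts on this topic in academia and medicine. To detect a disease, a person with symptoms must undergo several laboratory examinations and use testing kits to obtain an accurate result. Moreover, a contagious disease calls for real-time or immediate results, and an outbreak requires a larger supply of testing kits. This study proposes a new approach that diagnoses a disease early by collecting and analyzing the symptoms a person is currently experiencing, without laboratory examinations. The “Symptoms and COVID Presence” dataset available on Kaggle, which consists of COVID-19 positive and negative samples, was used to develop the machine learning model. The dataset was analyzed using Variance Threshold, Variance Inflation Factor, the Pearson correlation coefficient, and World Health Organization statements to determine which attributes to include in model development. We balanced the data with the Synthetic Minority Oversampling Technique (SMOTE) to prevent sample bias during training. The results show that the random forest, support vector machine, k-nearest neighbors, and multi-layer perceptron artificial neural network algorithms achieved the highest scores: 98.84% accuracy, 100% sensitivity, 98.79% specificity, and 98.84% AUC. To make the machine learning model available for use, we serialized it and used it to develop a web application. The web application collects data through a form presented to the user. The user selects symptoms by clicking or tapping items in the symptom list on the form and then clicks the predict button to start the prediction. The result is shown as a notification indicating whether the user is possibly COVID-19 positive or negative, together with reminders and next steps. The application files were uploaded and deployed in cPanel on the Namecheap web hosting service. With this study, an early diagnosis of the contagious disease COVID-19 can be made immediately, without laboratory examinations or COVID-19 rapid antigen testing kits.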
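The data-balancing step can be illustrated with a minimal sketch of the idea behind SMOTE: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. This is not the thesis code. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration, and the sketch assumes numeric feature vectors.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbors.

    Hypothetical helper for illustration only; `minority` is a list of
    equal-length numeric feature vectors from the minority class.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        # Interpolate: base + gap * (neighbor - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_sketch(minority_samples, n_new=6, k=2, seed=1)
print(len(new_samples))
```

Because each synthetic point lies between two existing minority samples, its feature values are generally fractional; with yes/no symptom attributes, a practical pipeline would need to decide how to handle that.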

Parallel Abstract


Developing prediction models with machine learning algorithms for the early diagnosis of a disease is a useful application of artificial intelligence, as is evident from the many research attempts on this topic in the academic and medical fields. To detect the presence of a certain disease, a person experiencing symptoms needs to undergo several laboratory examinations and use testing kits to obtain an accurate result. Moreover, if the disease is contagious, real-time or immediate results are needed, and if there is an outbreak, a larger supply of testing kits is needed. This study proposes a new approach to the early diagnosis of a disease: collecting and analyzing the symptoms a person is currently experiencing, without the need for laboratory examinations. The “Symptoms and COVID Presence” dataset available on Kaggle, which comprises COVID-19 positive and negative samples, was used to develop the machine learning model. The dataset was analyzed using Variance Threshold, Variance Inflation Factor, the Pearson correlation coefficient, and World Health Organization statements to determine which attributes to include in model development. To prevent sample bias in the training process, we balanced the data using the Synthetic Minority Oversampling Technique (SMOTE). Six machine learning algorithms were used to develop the models: J48 decision tree, random forest, support vector machine, k-nearest neighbors, naive Bayes, and multi-layer perceptron artificial neural network. We developed the models in Google Colaboratory using the Python programming language. To obtain the best performance from each algorithm, hyperparameter optimization and 10-fold cross-validation were performed.
The results of hyperparameter optimization were compared in terms of accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve. The random forest, support vector machine, k-nearest neighbors, and multi-layer perceptron artificial neural network algorithms attained the highest scores: 98.84% accuracy, 100% sensitivity, 98.79% specificity, and 98.84% area under the ROC curve. To make the machine learning model available for use, we serialized it and used it to develop a web application. The web application collects data through a prediction form presented to the user. The user selects symptoms by clicking or tapping items in the list of symptoms on the form and then clicks the predict button to start the prediction. A notification indicating whether the user is possibly COVID-19 positive or negative is displayed, along with reminders and next steps. The application files were uploaded and deployed in cPanel on the Namecheap web hosting provider. With this study, an early diagnosis of the contagious disease COVID-19 can be performed immediately, without laboratory examinations or COVID-19 rapid antigen testing kits.
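Three of the evaluation measures named above follow directly from the counts of a binary confusion matrix. The sketch below shows the standard definitions. The function names and the labels are hypothetical and do not come from the thesis data; 1 is assumed to mean COVID-19 positive and 0 negative.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Made-up labels: every positive is caught, one negative is misclassified.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
# {'accuracy': 0.875, 'sensitivity': 1.0, 'specificity': 0.8}
```

In this toy case sensitivity is 100% because no positive case is missed, while the single false positive lowers specificity and accuracy. The reported scores show the same pattern.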

