
Using the Corpus of Chief Complaints for Predicting Emergency Hospital Admissions (基於主訴語料庫進行急診病患住院預測之研究)

Advisor: 吳怡瑾

Abstract


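A chief complaint (CC) is a short text describing why a patient has come to the emergency department. It is an important basis for assessing the patient's condition and one of the core components of medical orders. This study uses 824,614 emergency chief-complaint records spanning six years, provided by a large hospital in northern Taiwan, and applies text processing and document mining techniques to build a corpus for each of the five triage levels. The main goal is to mine highly discriminative CC keywords and to use the triage-specific corpora to predict the likelihood of admission at the moment the patient is triaged, so that the hospital can prepare beds early. The study therefore covers CC keyword extraction, statistical testing of keywords for admission prediction, and construction of the CC corpus. Preliminary results show that keywords for the same triage level are highly similar from year to year, which indicates that the CC vocabulary in the data is fairly stable across years and triage levels; CC keywords should therefore be an important attribute for predicting admission. Starting from admission keywords screened by TFIDF, the study applies a chi-square test to establish 120 corpus keywords, computes correlation coefficients to establish 107 corpus keywords, and uses information entropy to establish 37 corpus keywords, providing a basis for follow-up admission prediction research. The study finds that using TFIDF and information entropy (Entropy) yields predictions that better balance admitted and non-admitted cases with a smaller proportion of CC keywords, and that triage level 1 performs best. The study shows that CCs are important data for research such as syndromic surveillance and admission prediction, which supports the allocation of medical resources. The preliminary results can serve as a reference for future research on clinical decision-making.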
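The TFIDF screening step can be illustrated with a minimal sketch. The abstract does not state which TFIDF variant or tokenizer was used, so everything here is an assumption: the function name `tfidf_by_group` is hypothetical, each triage level's pooled complaints are treated as one "document", and the toy English tokens stand in for the real CC data.

```python
import math
from collections import Counter

def tfidf_by_group(groups):
    """groups maps a label (e.g. a triage level) to a list of tokenized
    chief complaints. Returns {label: {term: tf-idf score}}."""
    # Pool each group's complaints into a single bag of words.
    counts = {g: Counter(tok for cc in ccs for tok in cc)
              for g, ccs in groups.items()}
    n_groups = len(counts)
    # Document frequency: number of groups in which a term appears.
    df = Counter(term for c in counts.values() for term in c)
    scores = {}
    for g, c in counts.items():
        total = sum(c.values())
        scores[g] = {
            term: (n / total) * math.log(n_groups / df[term])
            for term, n in c.items()
        }
    return scores

# Hypothetical toy data, not taken from the study.
toy = {
    1: [["chest", "pain"], ["cardiac", "arrest"]],
    3: [["abdominal", "pain"], ["fever"]],
    5: [["cough"], ["fever"]],
}
scores = tfidf_by_group(toy)
# Three highest-scoring terms for triage level 1.
top_level1 = sorted(scores[1], key=scores[1].get, reverse=True)[:3]
```

In this toy example "pain" appears under two triage levels, so it scores lower for level 1 than terms that occur only there. That is the sense in which TFIDF favors discriminative keywords.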

Parallel Abstract


Chief complaints (CCs) are short free-text phrases describing the reasons for patients' emergency department (ED) visits and are important references for medical orders. This research used CC text data from January 1, 2010 to December 31, 2015, amounting to 824,614 records from the hospital information system of a representative ED in Taiwan. Text processing and data mining techniques were used to construct a CC corpus for five-level ED triage. The aim of this research is to extract CC keywords in order to predict, at the time of triage (the early stage of the ED process), whether a patient will be admitted, so that the hospital can prepare beds in time. The research covers extracting keywords from CCs, conducting statistical tests on the keywords, constructing a triage-based corpus, and then predicting the possibility of emergency hospital admission. Our preliminary analysis shows that the CC keywords for each triage level are quite similar across the six-year data collection period. This indicates that the content of CCs is quite stable; we therefore believe the hospital can use ED CCs to predict admission at an early stage of the process. Starting from keywords screened by TFIDF, the research applied a chi-square test for hospitalization to establish 120 CC corpus keywords, calculated correlation coefficients to establish 107 corpus keywords, and used information entropy to establish 37 corpus keywords, providing a basis for follow-up hospitalization prediction research. The research found that using TFIDF and Entropy yields predictions that better balance hospitalized and non-hospitalized cases with a small proportion of the CC keywords, and that triage level 1 performs best. The research shows that CCs can be a data source for syndromic surveillance and inpatient prediction in support of the arrangement of medical resources. The results serve as a reference model for related ED research on clinical decision support in a similar context.
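The three keyword tests named in the abstract (chi-square, correlation coefficient, information entropy) can be sketched for a single keyword as follows. The abstract does not say which correlation coefficient was used or how entropy was applied, so this sketch makes assumptions: it uses the phi coefficient for a 2x2 table and the entropy of the admission outcome among visits that contain the keyword. The function names and counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for
        a = keyword present, admitted    b = keyword present, not admitted
        c = keyword absent, admitted     d = keyword absent, not admitted
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def phi_coefficient(a, b, c, d):
    """Phi coefficient: correlation between two binary variables."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return (a * d - b * c) / math.sqrt(denom)

def binary_entropy(p):
    """Shannon entropy in bits of a yes/no outcome with P(yes) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical counts for one keyword, not taken from the study.
a, b, c, d = 80, 20, 120, 780
stat = chi_square_2x2(a, b, c, d)
significant = stat > 3.841  # critical value for df = 1 at alpha = 0.05
phi = phi_coefficient(a, b, c, d)
# Entropy of the admission outcome among visits containing the keyword.
h_with_keyword = binary_entropy(a / (a + b))
```

A keyword with a large chi-square statistic, a phi value far from zero, and low entropy among the visits that contain it separates admitted from non-admitted patients well under these assumptions.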

Parallel Keywords

Chief complaint; corpus; data mining; text processing; triage
