
Artificial Intelligence-Based Analysis of Hyaluronic Acid Volume Degradation After Injection Laryngoplasty

Artificial Intelligence-Based Ultrasonic Image Analysis for Estimating and Tracking the Degradation of Injection Laryngoplasty

Advisor: 李明穗

Abstract


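As society advances, communication grows ever more important and people use their voices far more often. Overuse of the vocal cords is lowering the age at which vocal cord atrophy and aging commonly set in. Atrophied vocal cords do not close easily during phonation, so patients must strain harder to produce a normal voice; in this vicious circle the condition worsens and causes problems in daily life. Office-based vocal cord injection has matured over many years and is now widely applied to most vocal cord disorders. To treat vocal cord paralysis or atrophy, hyaluronic acid is injected into the vocal cord to increase its volume and help it close. To monitor the current state of the hyaluronic acid in the vocal cord, doctors use ultrasound images and the patient's voice signals. Ultrasound images show clearly how the volume of hyaluronic acid in the throat changes, so any abnormality can be acted on immediately. In this task, we aim to use the segmentation of hyaluronic acid in ultrasound images of the throat to compute its volume, which can then be used to track and estimate the trend of volume change and to support doctors' clinical judgment.

In medical image segmentation, previous work has shown that deep learning achieves unprecedented results. This task differs from conventional medical image segmentation in that the images are consecutive and each frame is highly correlated with its neighbors. Applying an ordinary 2D convolutional neural network to each image and then combining the per-image results may therefore lose the temporal information across frames, while a 3D convolutional neural network consumes a very large amount of resources (including memory and training time). In this thesis we propose combining a 2D convolutional neural network with a recurrent neural network, so that the model attends to the single-image result while also taking temporal information across frames into account, at a hardware cost far below that of a 3D convolutional neural network.

We further apply the method to our dataset, which consists of ultrasound images of patients' throats at different time points after hyaluronic acid injection. By following the degradation of the hyaluronic acid volume in the vocal cord, doctors can track the recovery of the patient's vocal cords; if we can accurately predict the hyaluronic acid volume, this further helps doctors' clinical judgment. In our experiments, our architecture performs best among the architectures compared.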
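The abstract does not say how the volume is derived from the segmentation results. As a minimal sketch of one common approach, the Python below sums the segmented area of each frame and multiplies by the spacing between frames. It assumes the frames are parallel slices with uniform spacing, which the source does not state; the function names, the `pixel_spacing_mm` and `frame_spacing_mm` parameters, and the toy masks are all hypothetical.

```python
def mask_area_mm2(mask, pixel_spacing_mm):
    """Area of one binary mask: foreground pixel count times the area of a pixel."""
    row_mm, col_mm = pixel_spacing_mm
    foreground = sum(1 for row in mask for value in row if value)
    return foreground * row_mm * col_mm


def estimate_volume_mm3(masks, pixel_spacing_mm, frame_spacing_mm):
    """Slice summation: total segmented area times the distance between frames.

    Assumption (not from the source): frames are parallel and evenly spaced.
    """
    total_area = sum(mask_area_mm2(mask, pixel_spacing_mm) for mask in masks)
    return total_area * frame_spacing_mm


# Toy masks standing in for per-frame segmentation output (1 = hyaluronic acid).
frame_a = [[0, 1, 1, 0], [0, 1, 1, 0]]  # 4 foreground pixels
frame_b = [[1, 1, 1, 0], [1, 1, 1, 0]]  # 6 foreground pixels
frame_c = [[0, 1, 0, 0], [0, 1, 0, 0]]  # 2 foreground pixels

volume = estimate_volume_mm3(
    [frame_a, frame_b, frame_c],
    pixel_spacing_mm=(0.5, 0.5),
    frame_spacing_mm=2.0,
)
print(volume)  # 6.0
```

Repeating this estimate for scans taken at different follow-up visits would give the volume trend that the abstract describes tracking.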

Parallel Abstract


With economic development and social progress, communication has become more important, and people use their voices much more often. The onset age of vocal cord atrophy, which results from overuse of the vocal cords, has been getting younger. Vocal cord atrophy is the thinning of one or both vocal muscles. Because one vocal fold cannot meet the other, patients have to strain to force the vocal cords closed during voicing. This turns into a vicious circle that worsens the condition and causes problems in daily life. A treatment called injection augmentation has developed into an effective practice over the years and is widely applied to many vocal cord disorders. In most cases of vocal cord atrophy or vocal cord paralysis, the vocal cord is injected with hyaluronic acid (HA) to reduce the glottal gap and help the vocal cords close properly. To observe how the HA behaves in the vocal cord, we use ultrasound images and voicing signals. The ultrasound image sequences clearly show changes in the volume of HA in the throat, so immediate action can be taken if needed. In this task, we use the segmentation results of HA in ultrasound throat images to calculate the HA volume, which can be used to track and estimate the trend of HA volume change and help doctors make clinical judgments.

In medical image segmentation, deep learning methods have achieved state-of-the-art results. This task, however, differs from conventional image segmentation: the images are successive frames and are very highly correlated. If we apply a 2D convolutional neural network to each image and concatenate the outputs, the results may lack temporal information. Simply replacing 2D convolution with 3D convolution, on the other hand, incurs extremely high computation costs (e.g., high memory consumption and long training time). In this study, we introduce an RNN-based image sequence segmentation model in which 2D convolution extracts context from each 2D image and an RNN exploits context across frames. The model also has a far lower computation cost than a 3D convolutional network.

We apply this method to our proposed dataset, which covers ultrasound image sequences of patients' throats after HA injection. Doctors observe the degradation of HA in the throat to track vocal cord recovery, so an accurate prediction of the HA volume can support their clinical judgment. In our experiments, our architecture yields significant performance gains over the other architectures compared.
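To make the cost argument concrete, the following Python sketch compares one stride-1, same-padded convolution layer applied frame by frame (2D) with the same layer extended across frames (3D). It counts only weights and multiply-accumulate operations for that single layer; the layer sizes are hypothetical, and the sketch says nothing about the recurrent part of the proposed model or about activation memory, which the abstract does not quantify.

```python
def conv_weights(c_in, c_out, kernel, dims):
    """Weight count of one conv layer (bias ignored) with a kernel of side `kernel`."""
    if dims not in (2, 3):
        raise ValueError("dims must be 2 or 3")
    return c_in * c_out * kernel ** dims


def conv_macs(frames, height, width, c_in, c_out, kernel, dims):
    """Multiply-accumulates for one stride-1, same-padded conv layer over a sequence.

    dims=2: a k x k kernel is applied to every frame independently.
    dims=3: a k x k x k kernel also spans neighbouring frames.
    """
    outputs = frames * height * width  # one output position per input position
    return outputs * conv_weights(c_in, c_out, kernel, dims)


# Hypothetical layer: 16 frames of 128 x 128, 32 -> 64 channels, kernel side 3.
shape = dict(frames=16, height=128, width=128, c_in=32, c_out=64, kernel=3)

weights_2d = conv_weights(32, 64, 3, dims=2)
weights_3d = conv_weights(32, 64, 3, dims=3)
macs_2d = conv_macs(dims=2, **shape)
macs_3d = conv_macs(dims=3, **shape)

print(weights_2d, weights_3d)  # 18432 55296
print(macs_3d // macs_2d)      # 3
```

Under these assumptions the 3D layer needs `kernel` times the weights and arithmetic of the 2D layer, and that factor applies to every layer in the network.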

