Breast cancer is a common cancer that affects women worldwide. Global cancer statistics have shown that early detection and diagnosis of breast cancer significantly improve the chances of survival, and ultrasound (US) imaging has gradually become a primary breast cancer screening method among common diagnostic tools because of its low cost, high resolution, non-invasiveness, and freedom from ionizing radiation. Nevertheless, interpreting ultrasound images requires extensive expertise, and subjective assessment can lead to diagnostic variability (reader dependency). The American College of Radiology (ACR) introduced the Breast Imaging Reporting and Data System (BI-RADS) to standardize breast imaging reporting, minimizing subjectivity and enhancing consistency. While BI-RADS has contributed to standardized reporting, the annotation process remains complex and time-consuming.

In recent years, deep learning (DL) techniques have advanced significantly and have been extended to various computer vision tasks, including segmentation, classification, and object detection. DL has brought revolutionary changes to many fields, owing to notable advantages such as automatic feature extraction and optimization techniques. With these advantages, integrating DL techniques into computer-aided diagnosis (CAD) systems can improve workflow efficiency and contribute to more accurate and timely medical diagnoses. In this study, we propose an integrated CADx system comprising a tumor segmentation method with decoder denoising pretraining, BI-RADS lexicon prediction, and a transformer-based classifier model to simplify the reporting process and enhance diagnostic outcomes in breast ultrasound images.

The study consists of four main parts, illustrated by the sketches below. First, we introduce a pretraining approach, Decoder Denoising Pretraining (DDeP), to improve the accuracy of tumor segmentation models at tumor boundaries. Next, fused images are generated by integrating the segmentation results (the segmented tumor image and the tumor shape image) with the original US image into the RGB channels to improve diagnostic accuracy. Third, a modified Swin Transformer V2 with a distinct prediction head for each lexicon takes the fused images as input to predict the BI-RADS lexicons. Lastly, we propose an Integrated PSA Transformer (IPSAT) architecture, which takes the fused images and the predicted BI-RADS lexicons as input to produce precise diagnostic outcomes.
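The abstract names DDeP but does not spell out its training recipe, so the following is a minimal sketch of a decoder denoising objective under our assumptions: the clean US image is corrupted with Gaussian noise and the encoder-decoder network is trained to regress that noise. The function name `ddep_step`, the noise scale `sigma`, and the single-channel regression head are illustrative choices, not the settings actually used; in decoder denoising schemes, typically only the decoder parameters are updated in this phase while the encoder keeps its pretrained initialization.

```python
import torch
import torch.nn as nn

def ddep_step(model: nn.Module, x: torch.Tensor, sigma: float = 0.2) -> torch.Tensor:
    """One decoder-denoising pretraining step (illustrative sketch).

    `model` is any encoder-decoder segmentation network (e.g., an
    Attention U-Net) whose final layer is temporarily replaced by a
    regression head with as many channels as the input image.
    """
    noise = torch.randn_like(x)                 # epsilon ~ N(0, I)
    x_noisy = x + sigma * noise                 # corrupt the clean US image
    pred = model(x_noisy)                       # predict the added noise
    return nn.functional.mse_loss(pred, noise)  # L2 denoising objective

# Illustrative usage (assumed names); the optimizer would typically
# cover only the decoder parameters:
# loss = ddep_step(attention_unet, us_batch)    # us_batch: (B, 1, H, W)
# loss.backward(); optimizer.step()
```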
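The fused image is described above as stacking three images into the RGB channels. A minimal sketch follows; the channel order and the exact definitions of the two derived channels (the US image masked to the tumor region, and the binary mask as the shape image) are our assumptions, and `make_fused_image` is a hypothetical helper.

```python
import numpy as np

def make_fused_image(us: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Stack a grayscale US image and its tumor mask into 3 channels.

    R: original US image
    G: segmented tumor (US image with the background suppressed)
    B: tumor shape (the binary mask itself)
    """
    us = us.astype(np.float32) / 255.0        # (H, W), scaled to [0, 1]
    mask = (mask > 0).astype(np.float32)      # (H, W), binarized to {0, 1}
    segmented = us * mask                     # keep tumor pixels only
    return np.stack([us, segmented, mask], axis=-1)   # (H, W, 3)
```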
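For the lexicon predictor, a minimal sketch of a backbone with one prediction head per lexicon is shown below, using timm's `swinv2_tiny_window8_256` as a stand-in for the modified Swin Transformer V2. The per-lexicon class counts follow common BI-RADS descriptors but are assumptions, since the abstract does not enumerate them.

```python
import timm
import torch
import torch.nn as nn

class MultiHeadLexiconNet(nn.Module):
    """Swin Transformer V2 backbone with one head per BI-RADS lexicon
    (sketch; backbone variant and class counts are assumptions)."""

    def __init__(self):
        super().__init__()
        # num_classes=0 makes timm return pooled features only.
        self.backbone = timm.create_model(
            "swinv2_tiny_window8_256", pretrained=True, num_classes=0)
        dim = self.backbone.num_features
        self.heads = nn.ModuleDict({
            "shape": nn.Linear(dim, 3),        # oval / round / irregular
            "orientation": nn.Linear(dim, 2),  # parallel / not parallel
            "margin": nn.Linear(dim, 2),       # circumscribed / not
            "heterogeneity": nn.Linear(dim, 2),
            "posterior": nn.Linear(dim, 3),    # none / enhancement / shadowing
        })

    def forward(self, fused: torch.Tensor) -> dict[str, torch.Tensor]:
        feats = self.backbone(fused)           # (B, dim) pooled features
        return {name: head(feats) for name, head in self.heads.items()}

# Illustrative usage: one cross-entropy loss per lexicon, summed.
# logits = model(fused_batch)                 # fused_batch: (B, 3, 256, 256)
# loss = sum(F.cross_entropy(logits[k], labels[k]) for k in logits)
```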
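The internal design of IPSAT is not specified in this abstract; purely to illustrate how the two inputs can be combined, the sketch below embeds the concatenated per-lexicon probabilities (12-dimensional under the class counts assumed above) with a small MLP and concatenates them with pooled image features before a benign/malignant classifier. All names and dimensions are hypothetical, and this is not the actual IPSAT architecture.

```python
import torch
import torch.nn as nn

class LexiconImageFusion(nn.Module):
    """Illustrative fusion of image features and predicted BI-RADS
    lexicon probabilities (hypothetical stand-in, not IPSAT itself)."""

    def __init__(self, image_dim: int = 768, lexicon_dim: int = 12,
                 hidden: int = 128):
        super().__init__()
        self.lexicon_mlp = nn.Sequential(      # embed the lexicon vector
            nn.Linear(lexicon_dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden))
        self.classifier = nn.Sequential(       # benign vs. malignant logits
            nn.LayerNorm(image_dim + hidden),
            nn.Linear(image_dim + hidden, 2))

    def forward(self, image_feats: torch.Tensor,
                lexicon_probs: torch.Tensor) -> torch.Tensor:
        # lexicon_probs: concatenated per-lexicon softmax outputs, (B, 12)
        lex = self.lexicon_mlp(lexicon_probs)
        return self.classifier(torch.cat([image_feats, lex], dim=-1))
```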
We evaluated the proposed methods on a dataset of 334 tumors (147 benign and 187 malignant) from 274 patients. In tumor segmentation, the Attention U-Net with DDeP achieved a Dice coefficient of 0.8371, an IoU of 0.7424, and a 95th-percentile Hausdorff distance of 25.7770. Next, using the fused images instead of the original US images yielded an accuracy of 83.53%, a sensitivity of 88.77%, a specificity of 76.87%, a positive predictive value of 83.00%, a negative predictive value of 84.33%, and an AUC of 0.8679, improving the accuracy, sensitivity, and negative predictive value over the original US images by 0.6%, 3.21%, and 3.08%, respectively. In BI-RADS lexicon prediction, we also achieved accurate predictions: the accuracies for shape, orientation, margin, heterogeneity, and posterior features were 84.13%, 79.94%, 76.35%, 82.63%, and 73.35%, respectively. Lastly, for tumor diagnosis, incorporating the predicted lexicons through IPSAT further improved performance, achieving an accuracy of 84.13%, a sensitivity of 89.30%, a specificity of 77.55%, a positive predictive value of 83.50%, a negative predictive value of 85.07%, and an AUC of 0.8703. These experimental results show that the proposed system can streamline the annotation process and improve the efficiency and accuracy of breast cancer diagnosis.