Performance Optimization of the Convolutional Neural Network Preprocessing Pipeline for Super-Resolution Medical Images

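Pathology slides are an important basis for deciding on a cancer treatment plan. Examining digital pathology slides for tumors is time-consuming and laborious for doctors. Examining a single image may take several minutes, and an accurate judgment relies on the doctor's years of clinical experience. To address this problem, we use a convolutional neural network (CNN) to recognize pathology slides and OpenSlide to read the medical images. However, because pathology slide images have very high resolution, reading and processing the images before CNN inference takes up most of the execution time (61.7%) and becomes the performance bottleneck. In this thesis, we propose methods for optimizing the reading of medical images. First, we use py-spy to profile OpenSlide and find that the main bottleneck is the cairo library, which is used to render image tiles and compose them into a complete image. We therefore re-implement the tile-composition function and obtain a significant performance improvement. Second, we find that the OpenSlide workflow contains redundant color conversions and that later processing requires additional data structure conversions and data copies, so we implement a new slide reader and accelerate the color space conversion with the AVX2 vector instruction set of x86 processors. Finally, we parallelize the program with multithreading and achieve scalable speedup. With all of these optimizations combined and 32 threads enabled, we achieve a 62.9x speedup over the original OpenSlide: the time to read one image drops from 80.1 seconds to 1.27 seconds, and the complete pathology slide interpretation process is therefore accelerated by 2.55x.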
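The tile-composition and multithreading steps can be pictured with a minimal sketch. This is not the thesis's implementation: the function names, the packed-RGB tile layout, and the equally sized tile grid are all assumptions made for illustration. The sketch only shows the structure of copying decoded tiles row by row into disjoint regions of one output buffer, with one task per tile. In CPython the threads will not give the native-code speedup reported above.

```python
from concurrent.futures import ThreadPoolExecutor

BYTES_PER_PIXEL = 3  # assumption: tiles are already decoded to packed RGB


def paste_tile(canvas, canvas_width, tile, tile_width, tile_height, x0, y0):
    """Copy one decoded tile into the output buffer, one row at a time."""
    row_bytes = tile_width * BYTES_PER_PIXEL
    for row in range(tile_height):
        src = row * row_bytes
        dst = ((y0 + row) * canvas_width + x0) * BYTES_PER_PIXEL
        canvas[dst:dst + row_bytes] = tile[src:src + row_bytes]


def assemble(tiles, tile_width, tile_height, cols, rows, workers=4):
    """Assemble a cols x rows grid of equally sized tiles into one image.

    `tiles` maps (col, row) to a bytes object of packed RGB pixels.
    Returns (image_bytes, image_width, image_height).
    """
    width = cols * tile_width
    height = rows * tile_height
    canvas = bytearray(width * height * BYTES_PER_PIXEL)

    def job(item):
        (col, row), tile = item
        # Each tile writes to its own region, so tasks never overlap.
        paste_tile(canvas, width, tile, tile_width, tile_height,
                   col * tile_width, row * tile_height)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(job, tiles.items()))
    return bytes(canvas), width, height
```

Because every tile maps to a distinct region of the buffer, the tasks need no locking, which is one plausible reason this kind of step parallelizes well.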

Keywords

medical image preprocessing; parallel computing; OpenSlide; convolutional neural network; medical image recognition

Parallel Abstract

Pathological slides are an essential basis for deciding on a cancer treatment strategy. Checking digital pathology slides for tumors is time-consuming and laborious for doctors: diagnosing a single image may take several minutes, and an accurate diagnosis depends on the doctor's years of clinical experience. To reduce this workload, we use a convolutional neural network (CNN) to classify pathology slides and OpenSlide to read the images. However, because the images have extremely high resolution, reading the digital slides with OpenSlide accounts for 61.7% of the total execution time of our CNN-based system. To resolve this performance bottleneck, this thesis proposes several methods for optimizing the reading of digital slides. First, we profile OpenSlide with py-spy and find that the main bottleneck lies in the cairo library, which is used to render small tiles and assemble them into a complete whole-slide image. Re-implementing the tile-assembly function improves performance significantly. Second, we find that the workflow performs redundant color conversions, data structure conversions, and data copies, so we implement an SVS slide reader with an improved workflow and accelerate the color space conversion with the AVX2 instruction set of x86 processors. Finally, we parallelize the entire program with multithreading and achieve scalable acceleration. With all optimizations combined and 32 threads enabled, we obtain a 62.9x speedup over the original OpenSlide: the average time to read an SVS slide drops from 80.1 seconds to 1.27 seconds, and the entire workflow for checking one super-high-resolution slide is accelerated by 2.55x.
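The reported figures can be cross-checked with Amdahl's law. The sketch below assumes that slide reading is the only stage that gets faster and that the 61.7% share applies to the baseline run. The function name is ours, not the thesis's.

```python
def overall_speedup(fraction, stage_speedup):
    """Amdahl's law: whole-pipeline speedup when only one stage gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


read_fraction = 0.617  # share of run time spent reading slides (from the abstract)
read_speedup = 62.9    # speedup of the reading stage with 32 threads (from the abstract)

print(f"{overall_speedup(read_fraction, read_speedup):.2f}x")  # prints 2.55x
```

Under these assumptions the result agrees with the 2.55x end-to-end figure. The rounded read times give 80.1 / 1.27 ≈ 63.1, slightly above the stated 62.9x, presumably because the speedup was computed from unrounded measurements.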

Parallel Keywords

medical image preprocessing; parallel computing; OpenSlide; convolutional neural network; medical image classification
