  • Thesis

Automated Detection of Cetacean Whistles Based on Deep Learning (基於深度學習之鯨豚哨叫聲自動辨識)

Advisor: 丁肇隆

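Abstract


In recent years, the Taiwanese government's conservation strategy for offshore wind farms and the nearby habitat of the Taiwanese white dolphin has drawn growing public attention, which has raised the importance of cetacean monitoring research. Until now, accurately identifying cetacean whistles has required manual work, a process that is extremely time-consuming and often inefficient. Some computer vision methods based on image processing already exist, but they cannot effectively adapt their detection thresholds when the ambient background sound varies widely, which limits their generality and severely degrades their detection ability across environments. To enable effective monitoring, this study uses deep learning to train an object detection model that automatically identifies cetacean whistles in spectrograms produced by the Short-Time Fourier Transform (STFT). The core of the method is an image object detection model based on YOLO (You Only Look Once) which, combined with several preprocessing techniques including image processing and whistle feature extraction, determines the time of occurrence and the frequency range of each whistle. Audio recorded by multiple hydrophones was used to train the neural network model to detect whistles in audio spectrograms, and the results were compared with other detection methods such as NTU_PAM and PAMGuard. Unlike previous detection methods, the proposed method does not require fixed thresholds, such as SNR, spectral energy, bandwidth, and duration thresholds, and can therefore operate in environments with different noise levels. It achieves very high recall in high-SNR (signal-to-noise ratio) environments and maintains stable detection performance in medium- and low-SNR environments. In addition, experiments were used to determine the best number of model parameters and thereby obtain the best model performance. Different spectrogram information was fused as multiple channels and fed to the detection model, and experiments confirmed that this preprocessing effectively improves detection performance. Spectrogram images computed at multiple window-size resolutions were also used to increase the diversity of the data, and this augmentation was verified to significantly improve detection performance. The study can thus provide important technical support for observing cetacean migration and monitoring population changes.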
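As an illustration of how a detected bounding box yields the time of occurrence and frequency range of a whistle, the following is a minimal sketch, not the thesis implementation. It assumes the detector returns boxes in the normalized `(x_center, y_center, width, height)` form commonly used by YOLO-family models, that the spectrogram image spans the clip linearly in time along x, and that frequency runs linearly from `f_min_hz` at the bottom row to `f_max_hz` at the top row. The names `WhistleEvent` and `box_to_event` and all numeric values are hypothetical.

```python
from typing import NamedTuple


class WhistleEvent(NamedTuple):
    """Time and frequency extent of one detected whistle."""
    t_start_s: float
    t_end_s: float
    f_low_hz: float
    f_high_hz: float


def box_to_event(x_center, y_center, width, height,
                 clip_start_s, clip_duration_s, f_min_hz, f_max_hz):
    """Map a normalized box on a spectrogram image to time and frequency.

    Assumption: x maps linearly to time across the clip, and y maps linearly
    to frequency with f_max_hz at the top row and f_min_hz at the bottom row.
    """
    left = x_center - width / 2
    right = x_center + width / 2
    top = y_center - height / 2      # image y grows downward
    bottom = y_center + height / 2
    band_hz = f_max_hz - f_min_hz
    return WhistleEvent(
        t_start_s=clip_start_s + left * clip_duration_s,
        t_end_s=clip_start_s + right * clip_duration_s,
        f_low_hz=f_max_hz - bottom * band_hz,
        f_high_hz=f_max_hz - top * band_hz,
    )


# Hypothetical detection on a 5 s clip that starts at t = 10 s and covers 0-48 kHz.
event = box_to_event(0.5, 0.25, 0.2, 0.1,
                     clip_start_s=10.0, clip_duration_s=5.0,
                     f_min_hz=0.0, f_max_hz=48_000.0)
print(event)
```

If the spectrogram were rendered with a logarithmic frequency axis, the linear frequency mapping above would have to be replaced accordingly.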

Parallel Abstract


In recent years, the Taiwanese government's strategies for protecting the habitats of Taiwanese white dolphins near offshore wind farms have gained public attention, underscoring the importance of cetacean observation studies. Traditionally, cetacean whistle identification has required manual effort, which is time-consuming and inefficient. Existing computer vision methods struggle to adjust their detection thresholds in varying noise environments. This study employs deep learning to train a YOLO-based object detection model that automatically identifies cetacean whistles in spectrograms. Combining image processing and whistle feature extraction, the model determines the occurrence time and frequency range of whistles, and time difference of arrival (TDOA) is used for sound source localization. Audio data from multiple hydrophones are used to train the neural network model, which is compared with NTU_PAM and PAMGuard. Unlike previous methods, the approach does not require fixed thresholds (e.g., SNR, spectral energy, bandwidth, duration); it achieves high recall in high-SNR (signal-to-noise ratio) environments while maintaining stable performance in medium- and low-SNR settings. Experiments were conducted to select the model size that gives the best performance. A multi-channel data fusion preprocessing method and a dataset augmentation technique using multiple window sizes were developed and validated, and both significantly improved detection performance. This study provides crucial technical support for monitoring cetacean migration and population changes.
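The abstract does not say how the TDOA is obtained. One common approach, shown here only as a sketch under that assumption, is to cross-correlate the signals from two hydrophones and take the lag with the highest correlation. The function name `estimate_tdoa`, the synthetic noise signal, the sample rate, and the nominal sound speed of 1500 m/s are all illustrative choices, not values from the thesis.

```python
import random


def estimate_tdoa(sig_a, sig_b, sample_rate_hz, max_lag):
    """Return the delay of sig_a relative to sig_b, in seconds.

    A positive value means the sound reached hydrophone A later than B.
    Only lags in [-max_lag, max_lag] samples are searched; in practice this
    bound can be derived from hydrophone spacing divided by sound speed.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, b in enumerate(sig_b):
            m = n + lag
            if 0 <= m < len(sig_a):
                score += sig_a[m] * b
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate_hz


# Synthetic example: channel A receives the same signal 5 samples after channel B.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(200)]
delay_samples = 5
chan_b = source
chan_a = [0.0] * delay_samples + source[:-delay_samples]

tdoa_s = estimate_tdoa(chan_a, chan_b, sample_rate_hz=1000.0, max_lag=20)
path_difference_m = tdoa_s * 1500.0  # assumed nominal sound speed in seawater
print(tdoa_s, path_difference_m)
```

A single TDOA constrains the source to a hyperbola between the two receivers, so localizing a whistle requires combining the delays from several hydrophone pairs.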

