  • Thesis


Initial Studies and Analysis of Birdsong Recognition

Advisor: 張智星


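This thesis studies recognition systems suited to identifying bird species from their songs. The experiments fall into two groups: recognition of single-species birdsong and recognition of mixed-species birdsong. The recognition models are GMM and HMM, and the features include MFCCs, energy, pitch, formants, and likelihood estimates of signal periodicity and aperiodicity. The number of states in each bird's HMM is chosen empirically to improve the recognition results. Three measures are used to evaluate the results: recognition rate, hit rate, and accuracy. The training and test data come from commercially published birdsong guides and cover the songs of 28 bird species. Before training, we manually labeled the attributes of each segment of the audio files to serve as ground truth against which the recognition results are compared. For single-species birdsong, the recognition rate rose by 12 percentage points, from 72% with the initial GMM system to 84.38% with the final improved HMM system; for mixed-species birdsong, the hit rate reached 78%.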




The objective of this study is to investigate suitable settings and features for robust birdsong recognition using GMM and HMM models. The features include MFCCs, energy, pitch, formants, voicing degree, and aperiodicity. The number of HMM states for each bird model is chosen empirically, based on observation of the birdsong. Three measures are used to evaluate the recognition results: recognition rate, hit rate, and accuracy. The training and test data are commercially available birdsong recordings covering 28 bird species. Nearly all recordings fall into two groups: single-species birdsong and mixed-species birdsong. We manually labeled the attributes of each segment of the recordings to serve as ground truth for evaluating recognition. For recordings containing a single species, the recognition rate rose by 12 percentage points from the GMM system to reach 84.38% with the improved HMM system. For mixed-species birdsong, the hit rate reached 78%.
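To illustrate the GMM-based recognition described in the abstract, the following is a minimal Python sketch, not the system built in this thesis. It assumes one diagonal-covariance GMM per species and picks the species whose model gives the highest total log-likelihood over a clip's feature frames. The species names, the toy 2-D features (standing in for MFCC-style vectors), the model parameters, and the definition of recognition rate as the fraction of correctly labeled clips are all assumptions made for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of all frames under one GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        total += log_sum_exp(
            [math.log(w) + log_gaussian_diag(x, mean, var) for w, mean, var in gmm]
        )
    return total


def recognize(frames, models):
    """Return the species whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))


def recognition_rate(clips, models):
    """Fraction of (frames, true_label) clips recognized correctly (assumed definition)."""
    correct = sum(1 for frames, label in clips if recognize(frames, models) == label)
    return correct / len(clips)


# Hypothetical toy models: two components each, 2-D features, hand-picked parameters.
MODELS = {
    "species_a": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "species_b": [(0.7, (5.0, 5.0), (1.0, 1.0)), (0.3, (6.0, 4.0), (1.0, 1.0))],
}
```

In a real system the component weights, means, and variances would be estimated from labeled training frames (for example with the EM algorithm) rather than written by hand, and an HMM would additionally model how the features change over time within a song.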


Birdsong recognition




