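This thesis applies Speaker Model Mel-Frequency Cepstral Coefficients (SMMFCC), derived from the Fast Fourier Transform (FFT), together with a Back-Propagation Neural Network (BPNN) to develop a real-time speaker recognition system on an ARM-based embedded platform. Because the proposed system is developed on an embedded platform with limited processing power and memory, the extracted speaker feature parameters must undergo effective data reduction. In addition, a client-server architecture assigns the computation-intensive neural network training to the server; the client only needs to download, update, and store the trained network weights remotely over Ethernet on first use or when a new user is added, after which the client recognition module can perform real-time speaker recognition. The server also records user logins and maintains the voiceprint database. Experimental data show that the system achieves an average correct recognition rate above 90% with a recognition time within 3 seconds, and that it can be widely applied wherever identity authentication is required, such as home security or vehicle anti-theft systems.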
The main contribution of this thesis is a real-time speaker recognition system based on Speaker Model Mel-Frequency Cepstral Coefficients (SMMFCC) derived from the Fast Fourier Transform (FFT). A Back-Propagation Neural Network (BPNN) performs the speaker recognition on an ARM-based embedded platform. Because embedded systems have limited computing capability and memory, the features extracted from the speaker model are reduced in size. To overcome the computation limit, a client-server architecture is proposed: the server handles the neural network training, which requires a great deal of computation, while the client performs real-time speaker recognition using the updated network weights retrieved from the server. Experimental results show that the average recognition rate of the system exceeds 90% and that the recognition time is less than 3 seconds. The proposed speaker recognition system can be applied to home security, office security, factory security, and similar settings.
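The client-server split described above can be illustrated with a minimal sketch of the client side only: a feed-forward pass that uses weights already trained elsewhere. Everything specific here is an assumption made for illustration and is not taken from the thesis: the single hidden layer, the sigmoid activation, the dictionary layout of `model`, the function names `layer_forward` and `recognize`, and the toy weight and feature values. The sketch shows only why recognition is cheap compared with training: the client evaluates a few weighted sums and never runs back-propagation.

```python
import math


def sigmoid(x):
    """Logistic activation (assumed; the abstract does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weights, biases, inputs):
    """One fully connected layer: sigmoid(W x + b), using plain lists."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def recognize(model, features):
    """Return (speaker_index, score) for one reduced feature vector."""
    hidden = layer_forward(model["w_hidden"], model["b_hidden"], features)
    output = layer_forward(model["w_output"], model["b_output"], hidden)
    # The output unit with the highest activation is taken as the speaker.
    best = max(range(len(output)), key=lambda i: output[i])
    return best, output[best]


# Hypothetical weights standing in for those the client would download
# from the server; real values would come from server-side training.
model = {
    "w_hidden": [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]],
    "b_hidden": [0.0, 0.1],
    "w_output": [[1.2, -0.7], [-1.0, 0.9]],
    "b_output": [0.0, 0.0],
}

# Hypothetical 3-dimensional feature vector (real SMMFCC vectors differ).
features = [0.9, 0.1, 0.3]
speaker, score = recognize(model, features)
print(speaker, round(score, 3))
```

Under these assumptions, adding a new user changes only the contents of `model`, so the client needs a fresh weight download rather than any local retraining.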