
基於數字文本相關之語者驗證的研究與實作

Study and Implementation on Digit-related Speaker Verification

Advisor: 張智星

Abstract


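Voiceprint verification is an important form of biometric authentication. Its main advantage is its simple hardware requirement: an ordinary off-the-shelf microphone is enough, so it is commonly used for biometric authentication over telephones and mobile phones. This thesis aims to build a text-dependent voiceprint verification system consisting of three parts. The "dynamic time warping speaker verification system" segments the digits by forced alignment and then uses dynamic time warping (DTW) to compare the Mel-frequency cepstral coefficients (MFCCs) of each digit at enrollment with those of the same digit at test time. The "utterance-level speaker verification system" extracts i-vectors directly from the enrollment and test recordings and scores the two sets of i-vectors with cosine similarity or probabilistic linear discriminant analysis (PLDA). The "digit-level speaker verification system" segments the digits by forced alignment, extracts an i-vector for each digit in the enrollment and test recordings, and scores the i-vectors of corresponding digits with cosine similarity or PLDA.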
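As a rough illustration of the DTW comparison described above, the following is a minimal sketch, not the thesis's implementation. It assumes each digit segment has already been cut out by forced alignment and converted to a sequence of MFCC frames; the function name `dtw_distance`, the Euclidean frame distance, and the length normalization are all assumptions made for this example.

```python
import math


def dtw_distance(enroll, test):
    """Length-normalized DTW distance between two MFCC sequences.

    `enroll` and `test` are lists of frames; each frame is a list of floats.
    Frame distance (Euclidean) and normalization by (n + m) are assumptions.
    """
    n, m = len(enroll), len(test)
    # acc[i][j] = minimal accumulated cost of aligning the first i enrollment
    # frames with the first j test frames.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_cost = math.dist(enroll[i - 1], test[j - 1])
            acc[i][j] = frame_cost + min(
                acc[i - 1][j],      # enrollment frame advances alone
                acc[i][j - 1],      # test frame advances alone
                acc[i - 1][j - 1],  # both advance together
            )
    return acc[n][m] / (n + m)


# Made-up 2-dimensional "MFCC" frames, for illustration only.
digit_enroll = [[0.0, 1.0], [1.0, 2.0], [3.0, 1.0]]
digit_test = [[0.0, 1.0], [0.0, 1.0], [1.0, 2.0], [3.0, 1.0], [3.0, 1.0]]
print(dtw_distance(digit_enroll, digit_test))
```

A smaller distance suggests that the two digit segments are more alike; how per-digit distances are combined into an accept/reject decision is not specified in the abstract.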

Parallel Abstract


Speaker recognition is an important biometric identification method. Its biggest advantage is its simple hardware requirement: only a microphone is needed. It is therefore widely used in mobile phones and call centers. The purpose of this thesis is to build a text-dependent speaker verification system, for which we implement three approaches and analyze their results: a dynamic time warping system that, after forced alignment, compares the MFCCs of each digit at enrollment with those of the same digit at testing; a sentence-level system that uses cosine similarity or PLDA to score the two groups of i-vectors extracted from the enrollment and test recordings; and a digit-level system that, after forced alignment, uses cosine similarity or PLDA to score the i-vectors of corresponding digits in the two recordings.
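The digit-level cosine scoring can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the thesis's method: i-vector extraction is not shown, the vectors are made up, and both the averaging of per-digit scores and the threshold value are assumptions, since the abstract does not say how per-digit scores are combined. The names `cosine_score`, `digit_level_score`, and `verify` are hypothetical.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def digit_level_score(enroll, test):
    """Average cosine score over digits present in both recordings.

    `enroll` and `test` map a digit label to that digit's i-vector.
    Averaging is an assumption made for this sketch.
    """
    shared = sorted(set(enroll) & set(test))
    if not shared:
        raise ValueError("no digit appears in both recordings")
    return sum(cosine_score(enroll[d], test[d]) for d in shared) / len(shared)


def verify(enroll, test, threshold=0.5):
    """Accept the claimed identity if the score reaches the threshold.

    The threshold value is hypothetical; in practice it would be tuned.
    """
    return digit_level_score(enroll, test) >= threshold


# Made-up 3-dimensional "i-vectors", for illustration only.
enroll_ivectors = {"1": [1.0, 0.0, 0.0], "7": [0.0, 2.0, 0.0]}
test_ivectors = {"1": [2.0, 0.0, 0.0], "7": [0.0, 1.0, 0.0]}
print(verify(enroll_ivectors, test_ivectors))
```

The sentence-level variant would apply `cosine_score` once to the pair of i-vectors extracted from the whole recordings; PLDA scoring would replace `cosine_score` with a trained model's log-likelihood ratio.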

