使用性別資訊於語者驗證系統之研究與實作

A study and implementation on Speaker Verification System using Gender Information

Advisor: 張智星 (Jyh-Shing Roger Jang)

Abstract


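In speaker verification, a common way to improve accuracy without changing the architecture of the acoustic model is to replace the gender-independent model with gender-dependent models trained separately on male and female speech. In practice, however, the gender of the test speaker is unknown, so the gender classifier plays a crucial role in this pipeline, and its accuracy directly affects the performance of the speaker verification system. Ensuring that the system correctly rejects impostors of either gender is another important requirement of this approach. To investigate how different ways of using speaker gender information affect a speaker verification system, this thesis implements a speaker verification system that uses i-vectors as speaker features and a probabilistic linear discriminant analysis (PLDA) model as the scorer, together with three i-vector-based gender classifiers. After analyzing the weaknesses of a typical speaker verification system that uses gender-dependent models, the thesis proposes several alternative ways of applying gender information under two broad conditions, "the gender classifier performs well" and "the gender classifier performs poorly", and analyzes how each method performs under different gender compositions of impostors. Finally, it achieves the goal of outperforming the conventional approach in all of these conditions.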

Parallel Abstract


For the speaker verification task, one way to improve a system's accuracy without changing the acoustic model's algorithm is to use gender-dependent models instead of a gender-independent one. However, since the test speaker's gender is not available, the gender classifier plays an important role: its accuracy directly affects the performance of the whole speaker verification system. Furthermore, ensuring that the system maintains good performance under different gender compositions of test speakers is also an important requirement. To explore how different uses of gender information affect a speaker verification system, this thesis implements a speaker verification system that uses i-vectors and a PLDA model as the speaker feature and scoring model, respectively, along with three i-vector-based gender classifiers. After analyzing the weaknesses of a speaker verification system that uses gender-dependent models in the conventional way, we propose several methods for applying gender information under conditions where the gender classifier performs well and where it performs poorly; we also analyze the performance of each method under different gender compositions of test speakers. Finally, we reach the goal of making our system outperform the traditional practice under these different circumstances.
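
The conventional pipeline described above (classify the test speaker's gender, then score the trial with the matching gender-dependent model) can be illustrated with a toy sketch. This is not the thesis's implementation: the nearest-centroid classifier stands in for the i-vector-based gender classifiers, centered cosine scoring stands in for PLDA, and all names (`classify_gender`, `verify`, `centroids`) and the 2-dimensional vectors are hypothetical choices made only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def center(vector, mean):
    """Subtract a gender-specific mean from an i-vector."""
    return [x - m for x, m in zip(vector, mean)]


def classify_gender(ivector, centroids):
    """Assumed stand-in classifier: pick the gender whose centroid is closest."""
    return max(centroids, key=lambda g: cosine(ivector, centroids[g]))


def verify(enroll_iv, test_iv, centroids, threshold):
    """Route the trial to the 'gender-dependent model' chosen by the classifier.

    The gender-dependent model here is only a per-gender mean used for
    centering before cosine scoring; a real system would use PLDA instead.
    """
    gender = classify_gender(test_iv, centroids)
    mean = centroids[gender]
    score = cosine(center(enroll_iv, mean), center(test_iv, mean))
    return gender, score, score >= threshold


# Toy data (invented for this sketch, not taken from the thesis).
centroids = {"male": [1.0, 0.0], "female": [0.0, 1.0]}
enroll = [2.0, 0.5]            # enrolled (claimed) speaker
target_trial = [2.1, 0.6]      # same speaker
impostor_trial = [0.2, 1.5]    # impostor of the other gender

target_result = verify(enroll, target_trial, centroids, threshold=0.5)
impostor_result = verify(enroll, impostor_trial, centroids, threshold=0.5)
```

Because the scoring model is chosen from the classifier's output alone, a misclassified test utterance is scored with the wrong model, which is the dependency on classifier accuracy that the abstract points out.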

