Speech Recognition System for Multimedia Applications

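The rapid development of electronic multimedia systems has opened up a wide range of possible multimedia services. Bluetooth in particular has become a new area of wireless communication technology, which means that applications can be integrated through it. For users to access these services more conveniently, a keyword-spotting speech recognition system becomes one of the important interfaces. In this thesis, we first design a speech recognition system for multimedia applications, guided by speech recognition design principles, that simulates how users operate a multimedia system. The services offered are based on the control modes that drivers use most often in a car, including listening to music, making phone calls, and navigation. A question-and-answer human-machine interface makes the system friendly to operate, and the system uses speech synthesis to simulate a human voice in its responses. The keyword-spotting-based recognition technique improves the portability and extensibility of the system, and the hierarchical architecture increases the reliability of speech recognition across environments. To deal with environmental noise and interference, we apply robust speech recognition, using robust speech features and model adaptation to reduce the influence of the test environment. Finally, we extend the system with a personalized design: speaker recognition provides user-specific services, and speaker model adaptation further strengthens recognition performance.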

Keywords

Speech recognition system

Parallel Abstract

The rapid development of in-vehicle electronic multimedia systems opens up a wide range of possible services. Bluetooth has become a new area of wireless technology, and applications can be integrated through it; speech recognition plays a crucial role in making these services convenient to use. In this thesis, we develop a speech recognition system for multimedia applications in the car environment that simulates how the driver and passengers use a multimedia system. The services are based on the most commonly used control modes, including listening to music, making phone calls, and navigation. A user-friendly interface is provided through an interactive question-and-answer approach, and speech synthesis is adopted to simulate a human voice in the system's responses. Keyword-spotting-based recognition improves the portability and scalability of the system, and the hierarchical design increases the reliability of speech recognition in the car environment. Vehicle noise and interference from the vehicle environment remain a challenge, so we apply robust speech recognition: robust features and model adaptation methods are adopted to reduce the impact of the test environment. Finally, we build a more personalized system that provides user-specific services. Using speaker recognition techniques, we also expect to further strengthen the performance of the recognition system.

Parallel Keywords

Speech Recognition System

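Cited By

呂易宸 (2011). 語音門禁系統 [Voice access control system] [Master's thesis, National Central University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0031-1903201314420437

許晏銘 (2013). 基於動態規劃之機器學習方法於小字彙DTW語音辨識系統之研究 [A study of dynamic-programming-based machine learning methods for a small-vocabulary DTW speech recognition system] [Master's thesis, National Formosa University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0028-3007201315014600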
