This thesis implements a speech-controlled command system that uses a PDA as its operating interface. The system runs on the Windows CE operating system with the Pocket PC as its development platform, and is built with eMbedded Visual C++ 3.0 and MFC. First, the speaker's utterance is segmented from the input signal using short-time energy and zero-crossing rate. In the time-frequency analysis stage, the multi-band resolution of wavelet packets is exploited: a suitable wavelet packet decomposition tree is constructed and used to extract the speech feature parameters. In the training stage, a vector quantization codebook is built by binary splitting, the speech models are built as discrete hidden Markov models (DHMMs), and the models are then adapted with the Baum-Welch algorithm. In the recognition stage, the Viterbi algorithm is used to compute the probability of the best match. The system executes the recognized commands through the Windows API and the Pocket Outlook Object Model (POOM) programming interface. Its functions include looking up contact information, scheduling appointments, and connecting to the Internet, among others.
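The recognition step described above can be illustrated with a minimal sketch of Viterbi scoring for a discrete HMM. This is not the thesis's implementation (which was written in eMbedded Visual C++); it is a plain Python illustration under stated assumptions. The function names `viterbi_log_score`, `log_matrix`, and `recognize`, and the idea of keeping one model per command in a dictionary, are hypothetical. The observation sequence is assumed to be the list of vector quantization codeword indices, one per frame, and scores are kept in the log domain to avoid underflow.

```python
import math


def log_matrix(rows):
    """Convert a matrix of probabilities to log-probabilities (0 maps to -inf)."""
    return [[math.log(p) if p > 0 else float("-inf") for p in row] for row in rows]


def viterbi_log_score(obs, log_pi, log_a, log_b):
    """Return (best log-probability, best state path) for a discrete HMM.

    obs    : list of VQ codeword indices, one per frame (assumed input format)
    log_pi : log initial-state probabilities, length N
    log_a  : N x N log transition probabilities, log_a[i][j] = log P(j | i)
    log_b  : N x M log emission probabilities, log_b[j][k] = log P(codeword k | state j)
    """
    n_states = len(log_pi)
    # delta[j]: best log score of any state path that ends in state j at the current frame
    delta = [log_pi[j] + log_b[j][obs[0]] for j in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n_states):
            # Best predecessor state for state j
            best_i = max(range(n_states), key=lambda i: prev[i] + log_a[i][j])
            delta.append(prev[best_i] + log_a[best_i][j] + log_b[j][o])
            ptr.append(best_i)
        backptr.append(ptr)
    # Trace the best path backwards from the best final state
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path


def recognize(obs, models):
    """Pick the model (hypothetically, one per command) with the highest Viterbi score."""
    scores = {name: viterbi_log_score(obs, *params)[0] for name, params in models.items()}
    return max(scores, key=scores.get), scores
```

In this sketch, each entry of `models` would hold the `(log_pi, log_a, log_b)` parameters of one trained DHMM, and `recognize` returns the name of the best-scoring model together with all scores. How the thesis itself organizes its models and chooses the winning command is not stated in the abstract.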