A Fast Framework for the Constrained Mean Trajectory Segment Model by Avoidance of Redundant Computation on Segment

The segment model (SM) is a family of methods that use a segmental distribution, rather than a frame-based density (e.g. HMM), to represent the underlying characteristics of the observation sequence. SMs have been shown to be more precise than HMMs. However, their high computational complexity prevents them from being used in practical systems. In this paper, we propose a framework that reduces the computational complexity of the Constrained Mean Trajectory Segment Model (CMTSM), one type of SM, by fixing the number of regions in a segment so that intermediate computation results can be shared. Our work is twofold. First, we compare the complexity of SM with that of HMM and identify the source of the complexity in SM. Second, we propose a fast CMTSM framework and illustrate it with two examples. The fast CMTSM achieves a 95.0% string accuracy rate in the speaker-independent test on our Mandarin digit string corpus, which is much higher than that obtained with the HMM-based system. At the same time, we keep the computational complexity of SM at the same level as that of HMM.
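The idea of sharing intermediate results across segments can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a toy setting in which a region is scored by a single fixed one-dimensional Gaussian, and the names `frame_loglik`, `build_prefix` and `shared_span_score` are hypothetical. It only shows why scoring many overlapping candidate spans frame by frame is redundant, and how evaluating each frame once and reusing the running sums removes that redundancy.

```python
import math
from itertools import accumulate


def frame_loglik(x, mean, var):
    """Log-density of a scalar frame x under a 1-D Gaussian (toy region model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def naive_span_score(frames, start, end, mean, var):
    """Score frames[start:end] by re-evaluating every frame in the span.

    Overlapping spans repeat the same per-frame evaluations.
    """
    return sum(frame_loglik(x, mean, var) for x in frames[start:end])


def build_prefix(frames, mean, var):
    """Evaluate each frame once; prefix[t] is the summed score of frames[:t]."""
    return [0.0] + list(accumulate(frame_loglik(x, mean, var) for x in frames))


def shared_span_score(prefix, start, end):
    """Score frames[start:end] in constant time from the shared prefix sums."""
    return prefix[end] - prefix[start]


# Illustrative data only; the values and parameters are made up for this sketch.
frames = [0.1, 0.4, 0.35, 0.8, 1.2, 0.9]
mean, var = 0.5, 0.25
prefix = build_prefix(frames, mean, var)
```

With `T` frames there are on the order of `T^2` candidate spans. The naive scorer performs a Gaussian evaluation for every frame of every span, whereas the shared version performs `T` evaluations in total and one subtraction per span.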

Keywords

Speech Recognition; Segment Model; Mandarin Digit String Recognition

