
A Fast Framework for the Constrained Mean Trajectory Segment Model by Avoidance of Redundant Computation on Segment

Abstract


The segment model (SM) is a family of methods that use a segmental distribution rather than a frame-based density (e.g., HMM) to represent the underlying characteristics of the observation sequence. It has been shown to be more precise than HMM. However, the high computational complexity of these models prevents them from being used in practical systems. In this paper, we propose a framework that reduces the computational complexity of the Constrained Mean Trajectory Segment Model (CMTSM), one type of SM, by fixing the number of regions in a segment so that intermediate computation results can be shared. Our work is twofold. First, we compare the complexity of SM with that of HMM and point out the source of the complexity in SM. Second, a fast CMTSM framework is proposed, and two examples are used to illustrate this framework. The fast CMTSM achieves a 95.0% string accuracy rate in the speaker-independent test on our Mandarin digit string data corpus, which is much higher than the performance obtained with an HMM-based system. At the same time, we successfully keep the computational complexity of SM at the same level as that of HMM.
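
The abstract does not spell out how intermediate results are shared, so the following is only a minimal sketch of one way the idea could work, not the authors' algorithm. It assumes a toy model that is not taken from the paper: scalar observations, one unit-variance Gaussian per region, and a segment split into a fixed number of equal-length regions. Under those assumptions, per-region frame scores can be accumulated once into a prefix table and reused by every candidate segment, so scoring a segment costs time proportional to the number of regions rather than to the segment length. All function names here are hypothetical.

```python
import math


def frame_loglik(x, mean, var=1.0):
    """Log-density of scalar x under a 1-D Gaussian (toy observation model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def region_bounds(start, end, n_regions):
    """Split frames [start, end) into n_regions contiguous regions.

    Assumption: regions are equal-length up to integer rounding.
    """
    length = end - start
    return [start + (r * length) // n_regions for r in range(n_regions + 1)]


def naive_segment_score(obs, start, end, means):
    """Score one segment by re-evaluating every frame: O(end - start)."""
    bounds = region_bounds(start, end, len(means))
    total = 0.0
    for r, mean in enumerate(means):
        for t in range(bounds[r], bounds[r + 1]):
            total += frame_loglik(obs[t], mean)
    return total


def build_prefix_table(obs, means):
    """Return prefix[r][t] = sum of log-likelihoods of obs[0:t] under region r.

    Built once per utterance and shared by all candidate segments.
    """
    table = []
    for mean in means:
        row = [0.0]
        for x in obs:
            row.append(row[-1] + frame_loglik(x, mean))
        table.append(row)
    return table


def shared_segment_score(prefix, start, end):
    """Score one segment from the shared table: O(number of regions)."""
    bounds = region_bounds(start, end, len(prefix))
    return sum(
        prefix[r][bounds[r + 1]] - prefix[r][bounds[r]]
        for r in range(len(prefix))
    )


# Toy data: 8 frames and a 3-region mean trajectory (illustrative values only).
obs = [0.1, 0.4, 0.9, 1.5, 2.1, 2.0, 1.2, 0.3]
means = [0.0, 1.0, 2.0]
prefix = build_prefix_table(obs, means)
```

With the table in place, `shared_segment_score(prefix, s, e)` gives the same value as `naive_segment_score(obs, s, e, means)` for any segment, up to floating-point rounding. Because the number of regions is fixed, the per-segment cost no longer grows with the segment duration.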

