Automatic Recognition of Cantonese-English Code-Mixing Speech

Abstract


Code-mixing is a common phenomenon in bilingual societies. It refers to intra-sentential switching between two languages within a spoken utterance. This paper presents the first study on automatic recognition of Cantonese-English code-mixing speech, which is common in Hong Kong. The study starts with the design and compilation of code-mixing speech and text corpora. The problems of acoustic modeling, language modeling, and language boundary detection are then investigated. Subsequently, a large-vocabulary code-mixing speech recognition system is developed based on a two-pass decoding algorithm. For acoustic modeling, cross-lingual acoustic models are shown to be more appropriate than language-dependent models. The language models are character tri-grams in which the embedded English words are grouped into a small number of classes. Language boundary detection either exploits the phonological and lexical differences between the two languages or is based on the result of cross-lingual speech recognition. The language boundary information is used to re-score the hypothesized syllables or words in the decoding process. The proposed code-mixing speech recognition system attains accuracies of 56.4% for Cantonese syllables and 53.0% for English words in code-mixing utterances.
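The idea of a character tri-gram model with class-grouped English words can be illustrated with a minimal sketch. It is not the authors' implementation: the class inventory (`ENGLISH_CLASSES`), the class token names, the whitespace-separated input format, and the helper names `tokenize` and `trigram_counts` are all assumptions made for illustration, and the abstract does not specify how the classes were actually defined.

```python
from collections import Counter

# Hypothetical word-to-class map; the paper's actual classes are not
# described in the abstract.
ENGLISH_CLASSES = {
    "email": "<EN_NOUN>",
    "file": "<EN_NOUN>",
    "check": "<EN_VERB>",
    "send": "<EN_VERB>",
}


def tokenize(utterance):
    """Turn a whitespace-separated code-mixing utterance into tokens.

    Chinese text is split into single characters; each embedded English
    word is replaced by its class token (assumed fallback: <EN_OTHER>).
    """
    tokens = []
    for unit in utterance.split():
        if unit.isascii() and unit.isalpha():
            tokens.append(ENGLISH_CLASSES.get(unit.lower(), "<EN_OTHER>"))
        else:
            tokens.extend(unit)  # one token per Chinese character
    return tokens


def trigram_counts(utterances):
    """Count tri-grams over the class-mapped token sequences."""
    counts = Counter()
    for utterance in utterances:
        toks = ["<s>", "<s>"] + tokenize(utterance) + ["</s>"]
        for i in range(len(toks) - 2):
            counts[tuple(toks[i:i + 3])] += 1
    return counts
```

Under these assumptions, utterances that differ only in the embedded English word share the same tri-gram statistics, which is one way grouping words into a few classes can reduce data sparsity in a small code-mixing text corpus.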
