TAICAR-The Collection and Annotation of an In-Car Speech Database Created in Taiwan

Abstract


This paper describes a project to create TAICAR, a Mandarin speech database for the automobile setting. Researchers from several universities and research institutes in Taiwan participated in the project, which has six recording sites. The goal is to build a corpus for developing and testing various speech-processing techniques. Words, sentences, and spontaneous queries uttered in a vehicular navigation setting were collected. A preliminary corpus of utterances from 192 speakers, recorded in different vehicles, has been created. The database contains more than 163,000 files and occupies 16.8 gigabytes of disk space.
