TAICAR-The Collection and Annotation of an In-Car Speech Database Created in Taiwan

This paper describes TAICAR, a project to create a Mandarin speech database for the in-car setting. Researchers from several universities and research institutes in Taiwan participated in the project, which has six recording sites. The goal is to produce a corpus for developing and testing various speech-processing techniques. The collected material comprises words, sentences, and spontaneous queries uttered in a vehicular navigation setting. A preliminary corpus of utterances from 192 speakers, recorded in different vehicles, has been created. The database contains more than 163,000 files and occupies 16.8 gigabytes of disk space.

Keywords

TAICAR; in-car speech; speech database; multi-channel recording; corpus collection and annotation

