Construction and Automatization of a Minnan Child Speech Corpus with some Research Findings

Abstract


The Taiwanese Child Language Corpus (TAICORP) is a corpus of spontaneous conversations between young children and their adult caretakers in Minnan-speaking (Taiwan Southern Min) families in Chiayi County, Taiwan. The corpus is distinctive in several ways: (1) it is a Minnan corpus; (2) it is speech-based; (3) its language does not yet have a conventionalized orthography; (4) it is a collection of longitudinal child language data; and (5) it is one of the largest child corpora in the world, with about two million syllables in 497,426 lines (utterances) drawn from about 330 hours of recordings. For its format, TAICORP adopts the Child Language Data Exchange System (CHILDES) [MacWhinney and Snow 1985; MacWhinney 1995] to transcribe and code the recordings into machine-readable text. This paper introduces the construction of this speech-based corpus and discusses some of the problems and challenges encountered. It also describes the development of an automatic word segmentation program with a spell-checker. Finally, it reports some findings on syllable distribution.
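As a rough illustration of how syllable distribution might be tallied from CHILDES-style transcripts, the sketch below counts syllables on the main speaker tiers of a few invented lines. It is not the authors' program: the function name `syllable_counts`, the sample utterances, and the assumption that syllables within a word are joined by hyphens are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter


def syllable_counts(chat_lines):
    """Count syllables on the main speaker tiers of CHAT-style lines."""
    counts = Counter()
    for line in chat_lines:
        # Main tiers start with '*' (e.g. '*CHI:'); '@' header lines and
        # '%' dependent tiers are skipped.
        if not line.startswith("*"):
            continue
        # Everything after the first ':' is the utterance text.
        _, _, utterance = line.partition(":")
        for token in utterance.split():
            word = token.strip(".,?!")
            if not word:
                continue  # the token was only punctuation
            # Assumption: syllables inside a word are hyphen-separated.
            counts.update(s for s in word.split("-") if s)
    return counts


# Invented sample lines, for illustration only.
sample = [
    "@Begin",
    "*CHI:\tma-ma lai .",
    "%com:\tillustrative dependent tier",
    "*MOT:\tlai lai .",
    "@End",
]
counts = syllable_counts(sample)
```

Calling `counts.most_common()` on the result would list syllables from most to least frequent, which is one simple way to inspect a distribution of this kind.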

References


Tsay, J. (2005). Documentation of Chiayi City, the Language and Literature Volume. Chiayi, Taiwan: Chiayi City Hall.
Boersma, P., & Levelt, C. (2000). Gradual Constraint-Ranking Learning Algorithm Predicts Acquisition Order. The Proceedings of the Thirtieth Annual Child Language Research Forum.
Chen, K.-J., Huang, C.-R., Chang, L.-P., & Hsu, H.-L. (1996). SINICA CORPUS: Design Methodology for Balanced Corpora. Language, Information, and Computation, 11, 167-176.
CKIP (1993). Technical Report. Taipei: Institute of Information Science, Academia Sinica.
CKIP (1998). Technical Report. Taipei: Institute of Information Science, Academia Sinica.

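Cited by


Iunn, U. G. (2009). 台語文處理技術:以變調及詞性標記為例 [Taiwanese text processing techniques: tone sandhi and part-of-speech tagging as examples] [doctoral dissertation, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2009.00377
Yuan, C. M. (2015). 閩南語幼兒塞擦音習得研究 [A study of the acquisition of affricates by Minnan-speaking children] [master's thesis, National Chung Cheng University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0033-2110201614042332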
