Analyzing the Morphological Structures in Seediq Words

Abstract


NLP techniques offer an efficient way to build large datasets for low-resource languages, which in turn supports the preservation and revitalization of indigenous languages. This paper proposes approaches for automatically analyzing the morphological structure of Seediq words, as a first step toward NLP applications such as machine translation. Seediq words are richly inflected. Sets of morphological rules were created from the linguistic features described in the Seediq syntax book (Sung, 2018), and, based on the regular morpho-phonological processes of Seediq, a new notion of a "deep root" is also proposed. The rule-based system proposed in this paper detects the presence of infixes and suffixes in Seediq words with a precision of 98.88% and a recall of 89.59%. The structure of a prefix string is predicted by probabilistic models. We conclude that the best system is a bigram model with a back-off approach and Lidstone smoothing, which reaches an accuracy of 82.86%.
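The abstract does not spell out how the bigram model is built, so the following is only a minimal sketch of one way a bigram model with back-off and Lidstone smoothing could rank candidate prefix segmentations. The class name `BackoffBigram`, the smoothing constant `gamma`, the simplified (unnormalized) back-off to unigram estimates, and the placeholder symbols `A`, `B`, `C` are all assumptions made for illustration; they are not taken from the paper and are not real Seediq prefixes.

```python
from collections import Counter
from math import prod


class BackoffBigram:
    """Bigram model over prefix sequences (illustrative sketch only).

    Seen bigrams get a Lidstone-smoothed bigram estimate; unseen bigrams
    back off to a Lidstone-smoothed unigram estimate. The back-off is not
    renormalized, so the scores are useful for ranking candidates but are
    not a proper probability distribution.
    """

    def __init__(self, sequences, gamma=0.1):
        self.gamma = gamma          # Lidstone constant (assumed value)
        self.unigrams = Counter()   # count of each token as a successor
        self.bigrams = Counter()    # count of (previous, current) pairs
        self.context = Counter()    # count of each token as a predecessor
        for seq in sequences:
            tokens = ["<s>", *seq, "</s>"]
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.context[prev] += 1
                self.unigrams[cur] += 1
        self.total = sum(self.unigrams.values())
        # Vocabulary size plus one slot reserved for unseen tokens.
        self.v = len(self.unigrams) + 1

    def unigram_prob(self, cur):
        return (self.unigrams[cur] + self.gamma) / (self.total + self.gamma * self.v)

    def prob(self, prev, cur):
        if self.bigrams[(prev, cur)] > 0:
            return (self.bigrams[(prev, cur)] + self.gamma) / (
                self.context[prev] + self.gamma * self.v
            )
        return self.unigram_prob(cur)  # back off when the bigram is unseen

    def score(self, seq):
        tokens = ["<s>", *seq, "</s>"]
        return prod(self.prob(p, c) for p, c in zip(tokens, tokens[1:]))


# Toy training data: each tuple is one segmented prefix string (placeholders).
training = [("A", "B"), ("A", "B"), ("A", "C"), ("B",)]
model = BackoffBigram(training, gamma=0.1)

# Two competing segmentations of the same hypothetical prefix string.
candidates = [("A", "B"), ("B", "A")]
best = max(candidates, key=model.score)
```

In this sketch the segmentation whose prefix order was observed in training scores higher than the one that relies on backed-off unigram estimates, which is the behavior a back-off model is meant to provide.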
