Chinese Word Segmentation and Recognition Based on Separable Convolution Bidirectional Long Short-Term Memory and Feature Point

Abstract


Automatic Chinese word segmentation is a prerequisite for Chinese information processing and is widely used in Chinese full-text retrieval, automatic full-text translation, text-to-speech (TTS) conversion, and other fields. Dictionaries play an important role in Chinese word segmentation, and the strengths and weaknesses of the segmentation mechanism directly affect the speed and efficiency of segmentation. We therefore propose a deep learning method for Chinese word segmentation. First, we construct a word segmentation model that combines separable convolution, a bidirectional long short-term memory (BiLSTM) network, and a conditional random field (CRF), using feature points that contain dictionary features. The model parameters are obtained by training on an existing word segmentation corpus. Then, text from the software engineering field is used as a small-scale training corpus to fine-tune the model trained on the general corpus. The experimental results show that transfer learning reduces the number of training iterations required by the domain segmentation model. Moreover, compared with other Chinese word segmentation models, the proposed model reduces the corpus labeling time required for training and achieves cross-domain transfer learning of the word segmentation model.
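The abstract does not say how the dictionary features are encoded or which tagging scheme the CRF layer uses. As a minimal sketch of one common approach, assuming per-character binary lexicon-match flags and BMES labels, the per-character inputs and targets for such a model might be prepared as follows. The function names, the `max_len` parameter, and the toy lexicon are hypothetical and are not taken from the paper.

```python
from typing import List, Set


def dictionary_features(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[List[int]]:
    """Per-character binary dictionary features (assumed encoding, not the paper's).

    For every word length n in 2..max_len, each character gets two flags:
    whether a lexicon word of length n starts at it, and whether one ends at it.
    """
    feats = []
    for i in range(len(sentence)):
        row = []
        for n in range(2, max_len + 1):
            # Lexicon word of length n beginning at position i
            starts = int(i + n <= len(sentence) and sentence[i:i + n] in lexicon)
            # Lexicon word of length n ending at position i
            ends = int(i - n + 1 >= 0 and sentence[i - n + 1:i + 1] in lexicon)
            row.extend([starts, ends])
        feats.append(row)
    return feats


def words_to_bmes(words: List[str]) -> List[str]:
    """Convert a segmented sentence into per-character B/M/E/S tags."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")  # single-character word
        else:
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags


# Toy software-engineering example (hypothetical lexicon)
lexicon = {"软件", "工程", "软件工程", "领域"}
features = dictionary_features("软件工程领域", lexicon)
labels = words_to_bmes(["软件工程", "领域"])
```

In a setup like this, each feature row would typically be concatenated with the character embedding before the convolution and BiLSTM layers, and the BMES tags would serve as the CRF targets. Fine-tuning on the domain corpus would then reuse the same encoding, possibly with a domain-specific lexicon.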
