以深層類神經網路標記中文階層式多標籤語意概念

Hierarchical Multi-Label Chinese Word Semantic Labeling using Deep Neural Network

Abstract


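Classification over more than 100 hierarchical labels is traditionally handled by flattening the labels, but doing so discards the hierarchical information in the taxonomy. This study classifies and labels the concepts of Chinese words in E-HowNet and proposes a deep neural network training method that takes the hierarchical relations of the E-HowNet taxonomy into account. The input to the network is the word vector of each word sample; for these vectors, the study also proposes 2-Bag Word2Vec, which accounts for the order of the preceding and following context. Because the training results at different levels of the hierarchy differ in importance, the minimum classification error method is applied at the end of the model to give each level its own weight at test time. Experimental results show that hierarchical classification achieves higher prediction accuracy than flat classification.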
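The difference between flat and level-weighted hierarchical prediction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy taxonomy, the per-level probabilities, the names `PATHS`, `path_score` and `predict`, and the hand-set weights. The study obtains its level weights with the minimum classification error method, which this sketch does not implement.

```python
import math

# Hypothetical toy taxonomy: each leaf concept with its root-to-leaf path.
PATHS = {
    "dog": ["entity", "animal", "dog"],
    "cat": ["entity", "animal", "cat"],
    "car": ["entity", "artifact", "car"],
}

def path_score(path, level_probs, level_weights):
    """Weighted sum of log-probabilities of the nodes along one path."""
    return sum(
        weight * math.log(probs[node])
        for node, probs, weight in zip(path, level_probs, level_weights)
    )

def predict(level_probs, level_weights):
    """Return the leaf whose path has the highest weighted score."""
    scores = {
        leaf: path_score(path, level_probs, level_weights)
        for leaf, path in PATHS.items()
    }
    return max(scores, key=scores.get)

# Made-up classifier outputs, one distribution per taxonomy level.
level_probs = [
    {"entity": 1.0},
    {"animal": 0.7, "artifact": 0.3},
    {"dog": 0.30, "cat": 0.25, "car": 0.45},
]

# Flat classification: only the leaf level counts.
flat_choice = predict(level_probs, [0.0, 0.0, 1.0])
# Hierarchical classification: every level contributes (weights set by hand here).
hier_choice = predict(level_probs, [1.0, 1.0, 1.0])
```

With these invented numbers the leaf-only decision picks `car`, while the decision that also uses the parent level picks `dog`, because the parent-level evidence favours `animal`.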

Parallel Abstract


Traditionally, classification over more than 100 hierarchical labels can be done with flat classification, but this loses the structural information of the taxonomy. This paper aims to classify the concepts of words in E-HowNet and proposes a deep neural network training method that uses the hierarchical relationships in the E-HowNet taxonomy. The input to the neural network is a word embedding, for which this paper proposes an order-aware 2-Bag Word2Vec. Experimental results show that hierarchical classification achieves higher accuracy than flat classification.
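One possible reading of an order-aware "2-bag" context is sketched below: the words before the target and the words after it are kept in two separate bags, and their averaged vectors are concatenated instead of being mixed into a single bag. This is an assumption about the general idea, not the paper's actual 2-Bag Word2Vec training procedure; the function names, the window size and the toy vectors are all hypothetical.

```python
def two_bag_contexts(tokens, window=2):
    """For each target word, collect a left bag and a right bag of context words."""
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        examples.append((target, left, right))
    return examples

def mean_vector(words, vectors, dim):
    """Average the vectors of the given words; an empty bag maps to zeros."""
    if not words:
        return [0.0] * dim
    return [sum(vectors[w][k] for w in words) / len(words) for k in range(dim)]

def two_bag_input(left, right, vectors, dim):
    """Concatenate the left-bag mean and the right-bag mean (order is preserved)."""
    return mean_vector(left, vectors, dim) + mean_vector(right, vectors, dim)

# Toy 2-dimensional vectors, invented for illustration.
vectors = {"a": [1.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 1.0], "d": [1.0, 1.0]}
examples = two_bag_contexts(["a", "b", "c", "d"], window=2)
target, left, right = examples[1]
features = two_bag_input(left, right, vectors, dim=2)
swapped = two_bag_input(right, left, vectors, dim=2)
```

Swapping the two bags changes the resulting feature vector, which is the property a single mixed bag of context words lacks.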

References


Huang, S.-L., Chung, Y.-S., Chen, K.-J. (2008). E-HowNet: the expansion of HowNet. Proceedings of the First National HowNet Workshop.
Liu, Q., Li, S.-j. (2002). Word Similarity Computing Based on How-net. International Journal of Computational Linguistics and Chinese Language Processing, 7(2), 59-76.
Mikolov, T., Chen, K., Corrado, G., Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. Proceedings of Workshop at ICLR.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. Proceedings of NIPS 2013.
Su, W.-F., Li, S.-Z., Li, T.-Q. (2002). A Module of Automatic Chinese Documents Classification Based on Concept. Computer Engineering and Applications, 6.
