Classification over more than 100 hierarchically organized labels is traditionally handled by flattening the labels, but doing so discards the hierarchical information in the taxonomy. This paper classifies and tags the concepts of Chinese words in E-HowNet and proposes a deep neural network training method that takes the hierarchical relationships of the E-HowNet taxonomy into account. The input to the network is the word embedding of each word sample; for these embeddings, the paper also proposes 2-Bag Word2Vec, an order-aware variant that takes the order of the surrounding context into account. Because the training results at different levels of the hierarchy differ in importance, the final stage of the model uses the minimum classification error method to assign each level a different weight at test time. Experimental results show that hierarchical classification achieves higher prediction accuracy than flat classification.
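To make the contrast between flat and hierarchical prediction concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy three-leaf taxonomy (`PATHS`), per-level class probabilities that are already available from some classifier, and level weights that are simply given rather than learned with the minimum classification error method. All names in it are hypothetical.

```python
import math

# Hypothetical toy taxonomy: each leaf concept maps to its root-to-leaf path.
PATHS = {
    "dog": ["entity", "animal", "dog"],
    "cat": ["entity", "animal", "cat"],
    "car": ["entity", "artifact", "car"],
}


def flat_predict(level_probs, paths=PATHS):
    """Flat classification: use only the leaf-level probabilities."""
    leaf_probs = level_probs[-1]
    return max(paths, key=lambda leaf: leaf_probs[leaf])


def hierarchical_predict(level_probs, level_weights, paths=PATHS):
    """Pick the leaf whose whole path has the highest weighted log-probability.

    level_probs[d] maps each node at depth d to its predicted probability.
    level_weights[d] is the weight of depth d (assumed given in this sketch).
    """
    def score(path):
        return sum(
            w * math.log(max(probs[node], 1e-12))  # clamp to avoid log(0)
            for w, probs, node in zip(level_weights, level_probs, path)
        )
    return max(paths, key=lambda leaf: score(paths[leaf]))


# Assumed example scores: the leaf level slightly prefers "car",
# but the level above it strongly prefers "animal".
level_probs = [
    {"entity": 1.0},
    {"animal": 0.9, "artifact": 0.1},
    {"dog": 0.30, "cat": 0.25, "car": 0.45},
]

flat_choice = flat_predict(level_probs)
hier_choice = hierarchical_predict(level_probs, [1.0, 1.0, 1.0])
```

In this toy case the flat decision follows the leaf scores alone, while the weighted path score lets evidence from the higher level override a weak leaf-level preference. Setting the weights to `[0.0, 0.0, 1.0]` reduces the hierarchical rule to the flat one.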