
以遠程監督式學習從中文文本進行關係自動擷取

Automatic Relation Extraction from a Chinese Corpus through Distant-supervised Learning

Advisor: 柯佳伶

Abstract


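This thesis studies relation extraction from Chinese text. Building on a neural network architecture and the idea of distant-supervised learning, it predicts whether a sentence expresses a specific relation and extracts the entity-word pairs in that sentence that hold the relation. Word embeddings and part-of-speech (POS) embeddings serve as the input features, and two models are trained separately: a relation detection model and a discriminative model. The former uses a sequential bidirectional long short-term memory (BiLSTM) network to predict whether a sentence expresses a specific relation, and applies a multi-dimensional attention mechanism to the sentences matching a given relation to pick out the relatively important words in each sentence as candidate entity words. The latter takes the vector difference of an entity-word pair and outputs whether that pair holds a given relation. Through a feedback learning mechanism, the results of the two models are used to enlarge the training data of the relation detection model and to adjust its training parameters, improving relation classification performance.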

English Abstract


This thesis studies relation extraction from a Chinese corpus through distant-supervised learning. We construct two models based on recurrent neural networks, both of which take word embeddings and POS embeddings as input. The first is the relation detection model, which detects the relation expressed by a sentence and selects candidate entity words with a multi-level structured (2-D matrix) attention mechanism. The candidate entity words are combined into entity pairs, which are passed to the second model. The second is the discriminative model, which uses the vector difference of an entity pair to determine whether the pair satisfies a relation. The output of the discriminative model identifies additional entity pairs for each relation. These pairs are fed back as additional training data for the relation detection model to improve its performance.
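The vector-difference idea behind the discriminative model can be illustrated with a minimal sketch. This is not the thesis's model: the abstract describes a trained neural discriminator, whereas the sketch below swaps in a simple nearest-prototype test based on cosine similarity. The toy embeddings, the function names, the "capital of" relation, and the 0.8 threshold are all hypothetical values chosen for illustration.

```python
import math

def vector_difference(head, tail):
    """Return tail - head, the feature used to represent an entity pair."""
    return [t - h for h, t in zip(head, tail)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def relation_prototype(seed_pairs, emb):
    """Average the difference vectors of seed pairs known to hold the relation."""
    diffs = [vector_difference(emb[h], emb[t]) for h, t in seed_pairs]
    dim = len(diffs[0])
    return [sum(d[i] for d in diffs) / len(diffs) for i in range(dim)]

def satisfies_relation(pair, emb, proto, threshold=0.8):
    """Stand-in discriminator (assumption): accept a pair whose difference
    vector points in nearly the same direction as the relation prototype."""
    head, tail = pair
    return cosine(vector_difference(emb[head], emb[tail]), proto) >= threshold

# Hypothetical 3-dimensional embeddings, invented for this example.
emb = {
    "Taipei": [1.0, 0.0, 0.2], "Taiwan": [1.0, 1.0, 0.2],
    "Tokyo":  [0.0, 0.1, 0.9], "Japan":  [0.0, 1.1, 0.9],
    "Paris":  [0.5, 0.0, 0.5], "France": [0.5, 1.0, 0.5],
}

# Seed pairs for a hypothetical "capital of" relation.
proto = relation_prototype([("Taipei", "Taiwan"), ("Tokyo", "Japan")], emb)

accepted = satisfies_relation(("Paris", "France"), emb, proto)
rejected = satisfies_relation(("Taipei", "Tokyo"), emb, proto)
```

In the feedback step the abstract describes, pairs accepted by the discriminator would be added to the relation detection model's training data. Here that would correspond to keeping the pairs for which `satisfies_relation` returns true.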

