Neural Normalization of Diverse Entity Labels

Advisor: 鄭卜壬

Abstract


Entity labels denote the names or descriptions of entities, and their formats usually follow no consistent specification. The diversity of such labels falls roughly into two kinds: different categories and different styles. Entities of different categories, such as schools and banks, usually have clearly different idiomatic names, so their labels differ accordingly; entities of the same category may still carry labels in different styles because those labels come from different sources and serve different purposes, such as the formal and informal names of a school. The dataset used in this thesis consists of a large number of telephone numbers. Each number is treated as an "entity", and the numbers span a wide range of entity categories, such as government agencies, restaurants, and companies. The owner of a number may have several names from different sources, which differ greatly in format and usage and can be regarded as "labels" in different styles. The dataset therefore covers both kinds of diversity at once.

We aim to normalize these diverse entity labels with neural networks so that each entity obtains a single representative label. For the model, we adopt a text summarization model as the basic framework and apply a weighted loss function during training to make the objective better suited to our task. Finally, we introduce multi-task learning, using an auxiliary task to help the model learn.

In the experiments, we first present a preprocessing method designed for our dataset. We then compare the performance of several models and training schemes, examine the outputs, analyze the models' behavior, and explain the results, in order to demonstrate the effectiveness of the proposed methods. We also give a more in-depth observation and discussion of the error cases.
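The abstract mentions a weighted loss function but does not define it. Below is a minimal sketch of one plausible reading: per-token weighted cross-entropy over the decoder outputs of a seq2seq summarization model. Everything here (the name `weighted_seq_loss`, the `token_weights` tensor, the shapes, `pad_id`) is an assumed illustration, not the thesis's actual implementation.

```python
import torch
import torch.nn.functional as F

def weighted_seq_loss(logits, targets, token_weights, pad_id=0):
    """Per-token weighted cross-entropy for a seq2seq decoder.

    logits:        (batch, seq_len, vocab)  decoder outputs
    targets:       (batch, seq_len)         gold token ids
    token_weights: (batch, seq_len)         assumed per-position weights
    """
    vocab = logits.size(-1)
    # One loss value per token position, no reduction yet.
    loss = F.cross_entropy(
        logits.reshape(-1, vocab),
        targets.reshape(-1),
        reduction="none",
    ).reshape_as(targets)
    # Mask out padding, then scale each position by its weight.
    mask = (targets != pad_id).float()
    loss = loss * token_weights * mask
    return loss.sum() / mask.sum().clamp(min=1.0)
```

Reweighting the per-token terms leaves the usual maximum-likelihood objective as the special case where every weight is 1, which is why it can slot into a standard summarization framework unchanged.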

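Likewise, the abstract introduces multi-task learning with one auxiliary task but does not name that task. As a purely hypothetical illustration, the sketch below shares an encoder between the main label-generation decoder and an auxiliary classifier that predicts the entity's category (government agency, restaurant, company, ...); `MultiTaskNormalizer`, the GRU architecture, and `aux_weight` are all assumptions, not the thesis's design.

```python
import torch
import torch.nn as nn

class MultiTaskNormalizer(nn.Module):
    """Shared encoder with two heads: label generation + auxiliary
    classification. The auxiliary task (entity-category prediction)
    is a hypothetical example; the thesis only states that one
    auxiliary task is used."""

    def __init__(self, vocab_size, num_categories, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.generator = nn.Linear(d_model, vocab_size)       # main head
        self.classifier = nn.Linear(d_model, num_categories)  # aux head

    def forward(self, src_ids, tgt_ids):
        # Encode the raw label; reuse the final state for both heads.
        enc_out, enc_state = self.encoder(self.embed(src_ids))
        dec_out, _ = self.decoder(self.embed(tgt_ids), enc_state)
        logits = self.generator(dec_out)           # (B, T, vocab)
        category = self.classifier(enc_state[-1])  # (B, num_categories)
        return logits, category

def joint_loss(main_loss, category_logits, category_ids, aux_weight=0.3):
    """Joint objective: main seq2seq loss (e.g. the weighted loss sketched
    above) plus a down-weighted auxiliary loss. aux_weight is a tuning
    knob, not a value from the thesis."""
    aux_loss = nn.functional.cross_entropy(category_logits, category_ids)
    return main_loss + aux_weight * aux_loss
```

Sharing the encoder is the standard hard-parameter-sharing form of multi-task learning: gradients from the auxiliary head regularize the representation the main decoder reads from.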
