
多條件耦合之半監督式學習於中文知識擷取之研究

Coupled Semi-Supervised Learning for Chinese Knowledge Extraction

Advisor: 許永真

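Abstract


A rich knowledge base is a great asset to systems with artificial intelligence, but building a complete one takes an enormous amount of labor and time. In the field of automated knowledge collection and extraction, Never Ending Language Learning (NELL) offers a good demonstration, but its ability to process Chinese is limited. This thesis proposes an automated Chinese knowledge extraction system. We observe that in Chinese sentences, nouns of the same category often co-occur with certain specific verbs; we use these verbs to build patterns that find more nouns of the same category. We combine this with the cross-language knowledge collection system under NELL to improve overall accuracy. Finally, experiments show that our system can support large-scale automated Chinese knowledge collection.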

Parallel Abstract


Robust intelligent applications benefit from rich knowledge bases, but building a rich and complete knowledge base is a time-consuming and labor-intensive task. Never Ending Language Learning (NELL) is a strong demonstration of large-scale automatic knowledge extraction, but some of its components are not well suited to Chinese. This thesis presents a Coupled Chinese Pattern Learner (CCPL), which extracts knowledge using textual patterns based on the relationships between nouns and verbs in Chinese sentences. We also implement Coupled Set Expander for Any Language (CSEAL) to work alongside CCPL. The experiments show that our system is capable of large-scale learning and maintains high accuracy in the automatic extraction of Chinese knowledge.
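The noun-verb pattern idea described in the abstract can be illustrated with a small sketch. This is not the thesis's CCPL implementation: the function names, the `top_k` and `min_patterns` parameters, and the toy data are all hypothetical, and the sketch assumes the text has already been segmented and tagged into (noun, verb) co-occurrence pairs. The `excluded` argument is a rough stand-in for the coupling constraint that a noun already assigned to a mutually exclusive category should not be promoted.

```python
from collections import Counter, defaultdict


def learn_verb_patterns(pairs, seeds, top_k=2):
    """Return the top_k verbs that co-occur most often with the seed nouns."""
    counts = Counter(verb for noun, verb in pairs if noun in seeds)
    return {verb for verb, _ in counts.most_common(top_k)}


def expand_category(pairs, seeds, excluded=frozenset(), top_k=2, min_patterns=2):
    """One bootstrapping round: learn verb patterns, then promote new nouns.

    A noun is promoted only if it co-occurs with at least `min_patterns`
    distinct pattern verbs and is neither a seed nor in `excluded`
    (hypothetical stand-in for a mutual-exclusion constraint).
    """
    patterns = learn_verb_patterns(pairs, seeds, top_k)
    verbs_by_noun = defaultdict(set)
    for noun, verb in pairs:
        if verb in patterns:
            verbs_by_noun[noun].add(verb)
    promoted = {
        noun
        for noun, verbs in verbs_by_noun.items()
        if len(verbs) >= min_patterns and noun not in seeds and noun not in excluded
    }
    return patterns, promoted


# Toy (noun, verb) pairs, invented for illustration only.
PAIRS = [
    ("蘋果", "吃"), ("蘋果", "種"), ("蘋果", "買"),
    ("香蕉", "吃"), ("香蕉", "種"),
    ("芒果", "吃"), ("芒果", "種"),
    ("米飯", "吃"),
    ("汽車", "開"), ("汽車", "買"),
]

patterns, promoted = expand_category(PAIRS, seeds={"蘋果", "香蕉"})
print(patterns, promoted)
```

Requiring agreement from several pattern verbs before promoting a noun is one simple way to limit drift; here 米飯 matches only one pattern verb and is therefore not added to the fruit category.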
