  • Degree thesis

Chinese Natural Language Processing Model Based on Deep Learning: Analysis of the Tamkang University Academic Policies Using an RNN

Advisor: 楊定揮

Abstract


Information on the Internet is now highly accessible, and answers to many questions, large and small, can be found online. Some questions, however, cannot be answered by keyword matching: the data may be so large that the answer goes unnoticed, or a similar answer may exist that the search engine fails to return. We therefore set out to build a small-scale search engine that finds answers similar to the question the user has in mind. Using the Tamkang University Academic Policies as the database, we develop a small search engine dedicated to it: the user types a question and receives an answer drawn from the database. This thesis addresses multi-text analysis with neural networks in deep learning. It introduces several machine language processing techniques, covering how to extract key terms from a text, how terms relate to one another, and how to group those terms into important topics, and applies them to the Tamkang University Academic Policies. The resulting prediction model is not perfect. The trend plots of the loss function and the prediction accuracy show that the accuracy is not high, and the answers the system returns do not match the question the author had in mind. Possible causes are that the database behind the search engine is small, or that the policies under the different categories are not sufficiently independent of one another, so that a question may be assigned to the wrong category. Resolving these problems is left to future research. The goal is a new search engine that makes it easier to find what is needed, and the research process can also be applied to other documents or websites. The language processing also relies on word segmentation, and statistical model building for this neural network is an important research topic.
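
The idea of returning the most similar entry for a typed question, rather than relying on exact keyword matches, can be illustrated with a minimal sketch. It does not reproduce the model built in the thesis: it swaps in overlapping character bigrams for a real Chinese word segmenter and TF-IDF cosine similarity for the trained RNN. The category names, the sample sentences, and the function names (`bigrams`, `build_index`, `search`) are all hypothetical and invented for this example; the sentences are not quotations from the Tamkang University Academic Policies.

```python
import math
from collections import Counter

# Hypothetical category -> text, invented for illustration only.
# These are NOT quotations from the actual academic policies.
DOCS = {
    "enrollment": "新生應於規定期限內辦理註冊手續",
    "leave": "學生因故得申請休學並於期滿後復學",
    "grades": "學期成績以一百分為滿分六十分為及格",
}


def bigrams(text):
    """Split text into overlapping character bigrams.

    Assumption: a crude stand-in for proper Chinese word segmentation.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


def build_index(docs):
    """Return (idf, vectors): smoothed IDF weights and TF-IDF vectors per category."""
    counts = {name: Counter(bigrams(text)) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())  # each token counted once per document
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
           for tok, df in doc_freq.items()}
    vectors = {name: {tok: tf * idf[tok] for tok, tf in c.items()}
               for name, c in counts.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(question, idf, vectors):
    """Return the best-matching category and the score of every category."""
    tf = Counter(bigrams(question))
    # Tokens never seen in the indexed texts carry no weight.
    query = {tok: c * idf[tok] for tok, c in tf.items() if tok in idf}
    scores = {name: cosine(query, vec) for name, vec in vectors.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    idf, vectors = build_index(DOCS)
    best, scores = search("如何申請休學", idf, vectors)
    print(best, scores)
```

With only three short texts, every category is trivially separable. The abstract's observation that categories which are not sufficiently independent attract each other's questions corresponds here to several categories sharing many of the same bigrams, which pushes their similarity scores close together.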

