Although large language models (LLMs) have achieved remarkable success across a wide range of tasks, they frequently hallucinate and lack the ability to reason over and explain domain-specific knowledge. Methods such as fine-tuning, Low-Rank Adaptation (LoRA), and Retrieval-Augmented Generation (RAG) aim to improve the accuracy of LLM responses. Fine-tuning requires very large amounts of data and computation to meaningfully change an LLM's parameters, and even then it does not resolve the model's black-box nature. LoRA, a lighter-weight alternative to full fine-tuning, achieves good results with far less data and computation, but it likewise leaves the black-box problem unsolved. RAG, the most widely adopted of these methods, mitigates the black-box problem by retrieving answers from an external database; however, it performs poorly on complex questions that require multi-step reasoning.

This thesis addresses three problems: the lack of explainability in LLMs, the excessive size of knowledge graphs (KGs), and the difficulty of reasoning over them. As carriers of broad knowledge, KGs can contain hundreds of millions of nodes and edges, which places a heavy burden on any reasoning chain that must traverse them. In the first phase of this study, the LLM proposes likely starting nodes, and the Prize-Collecting Steiner Tree (PCST) algorithm prunes less relevant information from the KG, yielding a compact subgraph and thereby addressing the problem of KG size; a minimal sketch of this step follows.
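The sketch below is illustrative rather than the thesis's actual implementation: it assumes the open-source `pcst_fast` solver, a KG encoded as integer node indices with an explicit edge list, and hypothetical inputs (`seed_nodes` for the LLM's proposed starting nodes, a uniform prize on those seeds, and uniform edge costs).

```python
import numpy as np
from pcst_fast import pcst_fast  # approximate Prize-Collecting Steiner Tree solver

def extract_subgraph(num_nodes, edge_list, seed_nodes, seed_prize=10.0, edge_cost=1.0):
    """Prune a large KG to a compact subgraph around LLM-suggested seed nodes.

    num_nodes:  total number of KG nodes, indexed 0..num_nodes-1
    edge_list:  (u, v) node-index pairs
    seed_nodes: node indices the LLM proposed as starting points (hypothetical input)
    """
    edges = np.asarray(edge_list, dtype=np.int64).reshape(-1, 2)
    # Prizes reward keeping a node; only the LLM-suggested seeds carry a prize here.
    prizes = np.zeros(num_nodes, dtype=np.float64)
    prizes[list(seed_nodes)] = seed_prize
    # Uniform edge costs penalize growth, steering the solver toward a small tree.
    costs = np.full(len(edges), edge_cost, dtype=np.float64)
    # root=-1: unrooted; num_clusters=1: one connected component;
    # "gw" selects Goemans-Williamson pruning; the final 0 silences logging.
    kept_nodes, kept_edge_idx = pcst_fast(edges, prizes, costs, -1, 1, "gw", 0)
    return kept_nodes, edges[kept_edge_idx]
```

In a fuller system, the prizes would more plausibly be graded relevance scores (for example, question-to-node embedding similarity) rather than a single constant on the seeds.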
In the second phase, the thesis develops a system in which the LLM and the KG collaborate: the LLM interactively explores relevant entities and relations within the pruned KG and reasons over the knowledge it retrieves. Because each answer is grounded in explicit entities and relations drawn from the graph, this addresses both the LLM's lack of explainability and the KG's reasoning problem; a sketch of the interaction loop follows. Experiments on two datasets show improvements across all evaluated metrics.
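The loop below is likewise a hypothetical reading of "interactive exploration," under assumptions the abstract does not state: the pruned subgraph is held in a networkx `MultiDiGraph` whose edges carry a `relation` attribute, and `llm` is a stand-in object whose `rank_paths`, `is_sufficient`, and `answer` methods represent prompted LLM calls.

```python
import networkx as nx

def explore_and_reason(graph: nx.MultiDiGraph, question: str, start_nodes, llm, max_hops: int = 3):
    """Beam-search-style exploration: the LLM picks which KG edges to follow.

    Each path alternates node, relation, node, ...; at every hop the LLM ranks
    the one-step expansions of all surviving paths and keeps the best few.
    """
    paths = [[node] for node in start_nodes]
    for _ in range(max_hops):
        candidates = []
        for path in paths:
            tail = path[-1]
            for _, neighbor, data in graph.out_edges(tail, data=True):
                candidates.append(path + [data.get("relation", ""), neighbor])
        if not candidates:
            break  # frontier exhausted
        # Hypothetical LLM call: rank expanded paths by relevance to the question.
        paths = llm.rank_paths(question, candidates, top_k=5)
        # Hypothetical LLM call: stop once the collected evidence suffices.
        if llm.is_sufficient(question, paths):
            break
    # The answer cites the surviving paths, so the reasoning can be inspected.
    return llm.answer(question, evidence_paths=paths)
```

Keeping the surviving paths explicit is what would make the final answer inspectable, which is one plausible way to realize the explainability claim above.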