Application and Performance Evaluation of RAG Technology: A Case Study in Library and Information Science

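This study examines the application and performance evaluation of Retrieval-Augmented Generation (RAG) in the field of Library and Information Science. Existing large language models (LLMs) such as GPT-3 show remarkable text generation ability, but when faced with specialized questions they are prone to AI hallucination, which leaves the generated content insufficiently accurate and relevant. RAG combines two stages, retrieval and generation: it retrieves external material to support text generation, improving the domain specificity and contextual coherence of the output, which makes it particularly suited to fields with heavy information needs and a high degree of specialization. In this study, questions were generated by AI and answered with RAG; scores from ChatGPT and from human raters were then combined, and RAG performance was analyzed quantitatively using multiple metrics such as F1 score and accuracy. The results show that RAG effectively overcomes the shortcomings of conventional LLMs in specialized domains and performs strongly in accuracy, relevance, and contextual matching. In addition, a test set generated with Ragas was used as a second, objective means of evaluation, further validating the effectiveness of RAG. However, the study also found that some generated answers leave room for improvement in faithfulness, particularly where supporting data is insufficient or the background information is biased. This study confirms that RAG can significantly improve the quality of text generated by LLMs in Library and Information Science, offering a more accurate and reliable tool for solving specialized problems and an important reference for research and applications in related fields.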
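The two-stage structure described above (retrieve external material, then generate from it) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the retriever here is a toy term-frequency cosine ranker, the corpus `PASSAGES` and the helper names `retrieve` and `build_prompt` are hypothetical, and the generation stage is reduced to building the prompt that would be passed to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens (toy tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Retrieval stage: return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, contexts: list[str]) -> str:
    """Generation stage input: prepend retrieved text so the model is grounded."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical mini-corpus, for illustration only.
PASSAGES = [
    "Cataloging describes library materials so that users can find them.",
    "Bibliometrics applies statistical methods to publications and citations.",
    "Reference services help users locate information in a library.",
]

question = "What does bibliometrics study?"
contexts = retrieve(question, PASSAGES, k=1)
prompt = build_prompt(question, contexts)  # this string would be sent to an LLM
```

A production system would typically replace the term-frequency ranker with dense embeddings and a vector index, but the control flow stays the same: rank passages against the question, then condition generation on the top results.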

Keywords

Retrieval-Augmented Generation; Large Language Models

Parallel Abstract

This study focuses on the application and performance evaluation of Retrieval-Augmented Generation (RAG) in the field of Library and Information Science. While existing large language models (LLMs) such as GPT-3 and BERT demonstrate remarkable language understanding and generation capabilities, they are prone to inaccurate or irrelevant output when addressing domain-specific questions, often because of artificial intelligence hallucination. By integrating retrieval and generation, RAG draws on external data to improve the accuracy and contextual relevance of generated content, making it particularly suitable for fields that demand precise, professional information. In this study, AI-generated questions were answered using RAG, and the answers were evaluated by combining scores from ChatGPT and human reviewers, using metrics such as F1 score and accuracy. The findings indicate that RAG effectively addresses the limitations of traditional LLMs in professional domains, performing strongly in accuracy, relevance, and contextual alignment. Additionally, a test set generated with Ragas was used to provide a second, objective evaluation of RAG's performance, further validating its effectiveness. However, the study also revealed room for improvement in the faithfulness of generated responses, particularly in cases of insufficient supporting data or contextual misalignment. This research confirms that RAG significantly improves the quality of text generated by large language models in the field of Library and Information Science. It offers a more accurate and reliable tool for solving professional problems and provides valuable insights for future research and applications in related domains.

Parallel Keywords

Retrieval-Augmented Generation; Large Language Models

References

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://doi.org/10.48550/arXiv.1810.04805

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165. https://doi.org/10.48550/arXiv.2005.14165

Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Riedel, S. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. arXiv preprint arXiv:2005.11401. https://doi.org/10.48550/arXiv.2005.11401

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.

Ragas. (n.d.). Ragas documentation (Version stable). Retrieved December 24, 2024, from https://docs.ragas.io/en/stable/
