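Evaluating Text Generation Technology Applied to Academic Paper Writing: A Case Study of Paper Abstracts in the Field of Artificial Intelligence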

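Text generation technology has matured considerably in recent years, and its influence on the process of academic production should not be underestimated. To gain a preliminary understanding of how this technology affects academic publication, and to explore whether humans and computers can tell computer-generated academic writing from human-written writing, this study drew on existing open resources and, taking paper abstracts in the field of artificial intelligence as its scope, conducted two experiments: "human evaluation of computer-generated abstracts" and "automated evaluation of abstract generation models." In Experiment 1, the language model GPT-2 generated paper abstracts based on corpora from ACL Anthology and arXiv (cs.AI), and we then analyzed how the English grammar checker Grammarly and human participants evaluated them. Experiment 2 used classifiers to test whether a computer can identify computer-generated abstracts, and compared the results with the participants' evaluations. The conclusions are as follows: 1. Computers can generate highly realistic abstracts, which outperform human-written abstracts on Grammarly's evaluation metrics. 2. Participants gave computer-generated abstracts a mean quality score of 3.617 and human-written abstracts 3.622, indicating that, when unaware that a computer was involved in generation, humans cannot clearly tell whether an abstract is computer-generated or human-written. 3. SciBERT achieved Micro and Macro F1 scores of 0.93 on 30 abstracts, far higher than the participants' 0.53 and 0.44, indicating that computers are able to identify computer-generated abstracts. Moreover, of the 2 abstracts that SciBERT misclassified, 1 was classified correctly by the human participants, which suggests that computers and humans may be able to complement each other in this identification task.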
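The Micro and Macro F1 figures in conclusion 3 can be illustrated with a minimal sketch. The label split used here (15 human-written and 15 computer-generated abstracts, with one error in each direction) is a hypothetical assumption, chosen only to be consistent with the 30 abstracts and 2 misclassifications reported; the study's actual split is not stated in the abstract. The function name `f1_scores` and the label strings are likewise illustrative.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical data: 30 abstracts, 2 misclassified (one per class).
y_true = ["human"] * 15 + ["generated"] * 15
y_pred = ["human"] * 14 + ["generated"] + ["generated"] * 14 + ["human"]

micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.2f}, macro F1 = {macro:.2f}")
```

In single-label classification, Micro F1 equals accuracy, so 2 errors in 30 abstracts give 28/30 ≈ 0.93 regardless of the class split. Macro F1 does depend on how the errors are distributed across classes, which is why the split above is flagged as an assumption.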

Keywords

Artificial Intelligence; Deep Learning; Natural Language Generation; Text Generation; Academic Paper Writing

Parallel Abstract

The application of text generation has matured in recent years and is having a growing impact on the process of academic production. To explore its influence on academic publications, and whether humans and computers can distinguish computer-generated from human-written academic articles, this study used abstracts in the field of artificial intelligence to conduct two experiments. In the first experiment, we generated abstracts with GPT-2, which was fine-tuned on corpora from ACL Anthology and arXiv, and then evaluated the abstracts with both Grammarly and human participants. In the second experiment, we used classifiers to test whether a computer could identify the computer-generated abstracts, and compared the results with the human evaluation. The conclusions are as follows: 1. The computer can generate high-quality abstracts. 2. The mean score of computer-generated abstracts was 3.617, while that of human-written abstracts was 3.622, indicating that humans could not distinguish whether an abstract was computer-generated or human-written. 3. The Micro and Macro F1 of SciBERT's predictions were both 0.93, higher than those of the human predictions (0.53 and 0.44), indicating that the computer is able to identify computer-generated abstracts.

Parallel Keywords

Artificial Intelligence; Deep Learning; Natural Language Generation; Text Generation; Academic Writing
