
How Genuine is Computer-generated News? Evaluation of Automated Text Generation Applied to Economic News

Abstract


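This study focuses on economic news and examines the GPT-2 model. After training on about 300,000 news articles, the model produced 15 computer-generated news (CGN) articles, which were mixed with 15 human-composed news (HCN) articles and rated for credibility by 12 subjects on a scale of 1 to 5. Of the 15 HCN articles, 1 had an average credibility of 2.92, below 3, for reasons such as a lack of logic and strong subjectivity. Of the 15 CGN articles, 2 each had an average credibility of 3.33, above 3, because the content was reasonable and the details were logical. Comparing parts of these 2 articles against the corpus showed that the computer's ability to splice together existing material and then polish it is already enough to deceive professionals. This study also trained a BERT model to assess the feasibility of automatically detecting computer-generated news. On the 30 articles above, BERT mispredicted only 2 CGN articles and classified the rest correctly, outperforming the subjects' collective prediction, which had 5 errors. Larger-scale experiments show that BERT can reach an F1 score of 0.96.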

Parallel Abstract


This research explores the GPT-2 deep learning model for economic news generation and evaluation. After GPT-2 was trained on about 300,000 news articles totaling 150 million words, it generated 15 news articles. These were combined with 15 real news articles written by journalists, and 12 subjects were invited to judge the credibility of the 30 articles on a scale of 1 to 5. The 8 subjects with economics-related majors were better at distinguishing human-composed news (HCN) from computer-generated news (CGN), whereas the 4 subjects with non-economics majors discriminated poorly, and one of them could not tell HCN from CGN at all. Among the 15 HCN articles, 1 was rated as non-genuine, with an average credibility of 2.92 (below 3), owing to a lack of logic and strong subjectivity. Among the 15 CGN articles, 2 were rated as genuine, each with an average credibility of 3.33 (above 3), because the content was reasonable and the details were logical. Comparing these two articles with the corpus shows that the computer's ability to splice together and retouch existing content can deceive professionals. Most of the CGN articles were nevertheless spotted, mainly because of obvious factual flaws and incorrect figures such as dates and stock codes. The research also explores the possibility of automatically detecting computer-generated news with a BERT-based neural network model. BERT made only 2 false predictions on the 30 articles above, outperforming the collective prediction of the 12 subjects, which had 5 errors. Further large-scale experiments show that BERT can reach an F1 score of 0.96.
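The comparison on the 30-article set can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' evaluation code: it assumes CGN is treated as the positive class and, following the summary above, that both of BERT's errors were CGN articles mistaken for HCN. The function and variable names are hypothetical. The 0.96 F1 score comes from the separate large-scale experiment and is not reproduced here.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures stated in the abstract.
TOTAL_ARTICLES = 30   # 15 CGN + 15 HCN
NUM_CGN = 15
BERT_ERRORS = 2       # assumption: both are CGN articles predicted as HCN
HUMAN_ERRORS = 5      # collective prediction of the 12 subjects

# Assumption: CGN is the positive class.
tp = NUM_CGN - BERT_ERRORS   # CGN articles BERT detected
fn = BERT_ERRORS             # CGN articles BERT missed
fp = 0                       # every HCN article was classified correctly

bert_accuracy = (TOTAL_ARTICLES - BERT_ERRORS) / TOTAL_ARTICLES
human_accuracy = (TOTAL_ARTICLES - HUMAN_ERRORS) / TOTAL_ARTICLES
bert_f1 = f1_from_counts(tp, fp, fn)

print(f"BERT accuracy:  {bert_accuracy:.3f}")
print(f"Human accuracy: {human_accuracy:.3f}")
print(f"BERT F1 (CGN as positive class): {bert_f1:.3f}")
```

The abstract does not say which articles account for the subjects' 5 collective errors, so only accuracy is computed for them.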

