In recent years, threats such as "fake news" and "disinformation" have become a national security concern in information warfare and a research priority for many countries. The problem is not new, however: opinion manipulation on social media, whether by Russia or by other countries, was already visible in 2014, when Russia intervened to influence the referendum on the status of Crimea in Ukraine, and it is visible again in the recent Russo-Ukrainian war. This paper therefore focuses on accounts that publish suspicious information and on their posts. We use the data released on Twitter's official Transparency website, where Twitter publishes the accounts and posts that its investigations have confirmed as suspicious; Twitter defines a suspicious account as one that engages in disinformation manipulation linked to a government or state. Unlike earlier identification methods, we apply anomaly detection, a machine learning technique, to train a classifier that distinguishes abnormal messages and abnormal accounts with high accuracy. For data collection, we built a crawling system based on the extract, transform, load (ETL) framework and used it to crawl the official accounts and tweets of celebrities. We then use normal posts from accounts with the "blue check" badge, whose identities have been officially verified, to measure how often the classifier misjudges legitimate content. The experimental results show that our identification method reaches an accuracy of 96%.
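The evaluation idea described above, fitting an anomaly detector and then checking how often it misjudges known-normal posts, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the detection algorithm or the features, so the sketch uses a simple z-score distance detector, two hypothetical features (posts per day and the fraction of posts containing links), and synthetic data in place of the Transparency and verified-account datasets.

```python
import math
import random
import statistics


class ZScoreDetector:
    """Toy one-class detector (an assumption, not the paper's model).

    It learns per-feature mean and standard deviation from a reference
    set and flags any point whose standardized distance exceeds a
    threshold taken from a quantile of the reference scores.
    """

    def __init__(self, quantile=0.95):
        self.quantile = quantile

    def fit(self, rows):
        cols = list(zip(*rows))
        self.means = [statistics.fmean(c) for c in cols]
        # Fall back to 1.0 when a feature has zero variance.
        self.stds = [statistics.pstdev(c) or 1.0 for c in cols]
        scores = sorted(self.score(r) for r in rows)
        idx = min(len(scores) - 1, int(self.quantile * len(scores)))
        self.threshold = scores[idx]
        return self

    def score(self, row):
        # Euclidean distance in standardized feature space.
        return math.sqrt(
            sum(((x - m) / s) ** 2
                for x, m, s in zip(row, self.means, self.stds))
        )

    def is_anomalous(self, row):
        return self.score(row) > self.threshold


rng = random.Random(0)


def normal_post():
    # Hypothetical features: (posts per day, fraction of posts with links).
    return (rng.gauss(5, 1), rng.gauss(0.2, 0.05))


def suspicious_post():
    return (rng.gauss(40, 5), rng.gauss(0.8, 0.05))


train_normal = [normal_post() for _ in range(200)]
held_out_normal = [normal_post() for _ in range(100)]  # stands in for verified accounts
suspicious = [suspicious_post() for _ in range(100)]   # stands in for flagged accounts

detector = ZScoreDetector(quantile=0.95).fit(train_normal)

# Share of known-normal posts wrongly flagged (misjudgment check).
false_positive_rate = sum(map(detector.is_anomalous, held_out_normal)) / len(held_out_normal)
# Share of synthetic suspicious posts correctly flagged.
detection_rate = sum(map(detector.is_anomalous, suspicious)) / len(suspicious)

print(f"false positive rate on held-out normal posts: {false_positive_rate:.2f}")
print(f"detection rate on synthetic suspicious posts: {detection_rate:.2f}")
```

The quantile parameter sets the trade-off: a lower value flags more suspicious posts but also misjudges more normal ones, which is why a separate set of known-normal posts is useful for validation.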