This study uses AI techniques to build an easy-to-use mobile app, backed by a server-side program we developed, that helps users quickly determine whether an SMS message is a scam. Legitimate messages are easy to collect, whereas scam messages are hard to obtain, so the numbers of scam and non-scam samples in the dataset differ greatly. To address this imbalance, we train a Generative Adversarial Network (GAN) on the small set of scam messages we collected, so that the trained generator can produce realistic synthetic scam messages. These generated messages are labeled as scam samples, which brings the scam and non-scam sample counts closer to balance and facilitates the subsequent fine-tuning of a BERT (Bidirectional Encoder Representations from Transformers) model. BERT is a natural language processing model whose self-attention mechanism lets it interpret the meaning of a whole sentence and classify it accurately. Finally, we set up a server so that mobile phone users can access the fine-tuned BERT model through our app, helping the public judge whether a message is a scam.
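The balancing step described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the label encoding, and the choice to fill the gap up to exact parity are all assumptions made for illustration, and `make_placeholder_generator` is a hypothetical stand-in that merely resamples real scam messages where a trained GAN generator would be called.

```python
import random
from typing import Callable, List, Tuple

SCAM, NORMAL = 1, 0  # assumed label encoding, not taken from the study


def balance_with_synthetic(
    normal_texts: List[str],
    scam_texts: List[str],
    generate_scam: Callable[[], str],
    seed: int = 0,
) -> List[Tuple[str, int]]:
    """Return a shuffled, labeled dataset in which generated scam texts
    fill the gap between the scam and non-scam sample counts."""
    # Number of synthetic scam texts needed to reach parity (assumed target).
    shortfall = max(0, len(normal_texts) - len(scam_texts))
    synthetic = [generate_scam() for _ in range(shortfall)]

    dataset = [(text, NORMAL) for text in normal_texts]
    # Generated texts receive the same label as real scam texts.
    dataset += [(text, SCAM) for text in scam_texts + synthetic]

    random.Random(seed).shuffle(dataset)
    return dataset


def make_placeholder_generator(seed_texts: List[str], seed: int = 0) -> Callable[[], str]:
    """Hypothetical stand-in for a trained GAN generator: it only resamples
    the real scam texts and tags them, so the sketch runs without a model."""
    rng = random.Random(seed)

    def generate() -> str:
        return rng.choice(seed_texts) + " [synthetic]"

    return generate


if __name__ == "__main__":
    normal = [f"normal message {i}" for i in range(8)]
    scam = ["You won a prize, click here", "Your account is locked, verify now"]
    data = balance_with_synthetic(normal, scam, make_placeholder_generator(scam))
    counts = {label: sum(1 for _, l in data if l == label) for label in (SCAM, NORMAL)}
    print(counts)  # {1: 8, 0: 8}
```

The resulting list of (text, label) pairs is the kind of input a classifier fine-tuning step would consume. In practice the target ratio need not be exactly 1:1, since the abstract only states that the sample counts become more balanced.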