
A Study on Predicting Advertising Traffic Fraud Using Machine Learning Techniques (利用機器學習技術預測廣告流量欺詐之研究)

Predicting Advertisement Fraud by Machine Learning

Advisor: 李勇昇

Abstract


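With the prevalence of mobile devices, the volume of advertising delivered on them has grown accordingly, and fraud risk is everywhere. According to statistics, 90% of ad-click traffic is fraudulent. An advertising channel can drive up costs simply by clicking on ads, and for companies that advertise online, click fraud can occur in overwhelming volume, inflating click traffic and wasting large amounts of advertising spend. This study uses the gradient boosted tree algorithm XGBoost to examine whether ad-click traffic generated by mobile-device users is fraudulent. It applies feature engineering to increase the dimensionality of the data and extract more information, with the aim of improving model accuracy: the original 8 features are expanded to 45 features for training, and the model is built with the XGBoost machine learning algorithm. To evaluate the predictions, two experimental groups were selected at random and five representative metrics were considered: Accuracy, Precision, Recall, F1 Score, and MCC. The final model achieved a classification accuracy of 99.95%.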
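As a rough illustration of the feature engineering step the abstract describes, the following Python sketch derives a few count-based and time-based features from raw click records. The abstract does not list the thesis's 8 original fields or its 45 derived features, so the field names (`ip`, `app`, `channel`, `click_time`), the sample records, and the helper `add_derived_features` are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw click records. The field names are assumptions made
# for illustration; they are not taken from the thesis.
clicks = [
    {"ip": "10.0.0.1", "app": 3, "channel": 101, "click_time": "2019-03-20 08:15:00"},
    {"ip": "10.0.0.1", "app": 3, "channel": 101, "click_time": "2019-03-20 08:15:02"},
    {"ip": "10.0.0.1", "app": 7, "channel": 205, "click_time": "2019-03-20 09:40:10"},
    {"ip": "10.0.0.2", "app": 3, "channel": 101, "click_time": "2019-03-20 21:05:30"},
]


def add_derived_features(records):
    """Return copies of the records with extra count and time features."""
    # How many clicks each IP produced, and each (IP, app) pair produced.
    per_ip = Counter(r["ip"] for r in records)
    per_ip_app = Counter((r["ip"], r["app"]) for r in records)

    enriched = []
    for r in records:
        ts = datetime.strptime(r["click_time"], "%Y-%m-%d %H:%M:%S")
        enriched.append({
            **r,
            "hour": ts.hour,  # hour of day of the click
            "clicks_per_ip": per_ip[r["ip"]],
            "clicks_per_ip_app": per_ip_app[(r["ip"], r["app"])],
        })
    return enriched


features = add_derived_features(clicks)
```

Each enriched record keeps its original fields and gains three derived ones. Grouped counts of this kind are one common way to widen a narrow click log before training a tree-based model; whether the thesis uses these particular features is not stated in the abstract.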

Parallel Abstract


As mobile devices have become popular, the volume of ads on mobile devices has also increased, and the risk of fraud is everywhere. According to statistics, 90% of ad-click traffic is fraudulent; an advertising channel need only click on an ad to drive up costs. For companies that advertise on the web, click fraud can occur in overwhelming volume, inflating click traffic and wasting large amounts of advertising spend. This study uses a gradient boosted tree algorithm (XGBoost) to examine whether ad-click traffic from mobile-device users is fraudulent. Machine learning first requires data preprocessing, and feature selection makes the model easier to train. However, because ad-click data has few fields, this study uses feature engineering to increase the dimensionality of the data and extract more information, with the aim of improving model accuracy. The original 8 features are extended to 45 features for training, and the XGBoost machine learning algorithm is used to build the model. To evaluate the predictions, two experimental groups were selected at random and five representative metrics were considered: Accuracy, Precision, Recall, F1 Score, and MCC. The final model achieved a classification accuracy of up to 99.95%.
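The five evaluation metrics named in the abstract can all be computed from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are made up for illustration and are not results from the thesis.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute Accuracy, Precision, Recall, F1 Score, and MCC from
    confusion-matrix counts (fraud is treated as the positive class)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Illustrative counts only, not data from the thesis.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=870)
```

Reporting MCC alongside accuracy is useful when the classes are imbalanced, because MCC uses all four cells of the confusion matrix, whereas accuracy can stay high even if the minority class is poorly predicted.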

