  • Thesis


An Android Behavior-Based Malware Detection using Machine Learning

Advisor: 孫宏民


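In recent years, smartphone usage has risen steadily and the underlying technology has matured. Smart devices offer a wide range of functions that make users' lives more convenient. According to surveys, 84% of phones run Android, which means that more than eight in ten smart-device users worldwide use the Android system. Android's popularity has attracted many interested developers. They can write creative applications with all kinds of functions, but they can also design malware that disguises itself as an ordinary application while carrying out malicious behavior in the background. Malware installed on a phone may steal the user's private data, such as phone numbers and credit card numbers, and may also cause financial loss, so detecting threats to phone security has become a major issue. In the past, the most common way to analyze malware on mobile devices was signature-based detection, which checks samples by matching signatures. As the number of Android developers has grown and techniques have advanced, however, the volume of malware has increased sharply, and traditional signature-based detection increasingly fails to keep up with malware development. This thesis therefore detects malware by combining behavior-based detection with machine learning. Our system addresses shortcomings of Droidbox by adding a custom automatic tapping program that recognizes the UI the application is currently displaying, with the aim of triggering malware effectively and recording the behavior it produces in the background, together with network behavior, read/write order, and so on. Using the declared permissions as an auxiliary signal, we apply machine learning to decide whether an application is malware, reducing the chance that malware harms users. We also obtained a large number of malware samples and normal apps to run experiments and verify the effectiveness of this method.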


In recent years, smartphones have become very popular. Many people use smartphones instead of traditional phones, and almost everyone has one. More and more functional mobile applications are being released, making users' lives more convenient. The popularity of Android attracts many developers who build not only useful and creative applications but also malicious software. Malware installed on a user's smartphone may not only steal private information such as the phone number, IMEI, and credit card number, but also cause financial loss. Detecting malicious software on smartphones has therefore become an important issue. In the past, signature-based detection was the most common method for detecting malicious software on smartphones. However, malware now spreads faster than research can keep up with it, and signature-based detection is no longer an effective detection method. In this thesis, we propose an Android behavior-based malware detection method using machine learning. We improve Droidbox, an Android application sandbox, by inserting a view-identification automatic trigger program that can click through mobile applications more effectively. In addition, we collect behavior such as network activity, file reads and writes, and permissions as feature data, and we use different machine learning algorithms to classify malware and evaluate their performance. We use a large number of malware and normal application samples to show that our method achieves high accuracy.
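As a rough illustration of the pipeline described in the abstract (behavior and permission features fed to a classifier), here is a minimal sketch in Python. It is not the thesis's implementation: the feature names in `FEATURES`, the event labels, and the toy samples are all hypothetical, and a simple nearest-centroid classifier stands in for the unspecified machine learning algorithms the thesis evaluates.

```python
from collections import Counter
from math import dist

# Hypothetical feature names; the abstract does not list the exact features.
FEATURES = [
    "net_connect", "file_read", "file_write",
    "perm_SEND_SMS", "perm_READ_CONTACTS", "perm_INTERNET",
]


def to_vector(events, permissions):
    """Count logged behavior events and declared permissions for one app."""
    counts = Counter(events)
    counts.update(f"perm_{p}" for p in permissions)
    return [float(counts[name]) for name in FEATURES]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train(samples):
    """samples: list of (vector, label). Returns {label: centroid}."""
    by_label = {}
    for vector, label in samples:
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: dist(model[label], vector))


# Toy, made-up samples standing in for sandbox logs of labeled apps.
training = [
    (to_vector(["net_connect", "file_read", "file_write", "net_connect"],
               ["SEND_SMS", "READ_CONTACTS", "INTERNET"]), "malware"),
    (to_vector(["net_connect", "net_connect", "net_connect", "file_read"],
               ["SEND_SMS", "INTERNET"]), "malware"),
    (to_vector(["file_read"], ["INTERNET"]), "benign"),
    (to_vector(["file_read", "file_write"], []), "benign"),
]
model = train(training)

unknown = to_vector(["net_connect", "net_connect", "file_write"],
                    ["SEND_SMS", "INTERNET"])
print(predict(model, unknown))
```

In practice the event list would come from the sandbox log produced while the automatic trigger program exercises the app, and the classifier would be trained and evaluated on a much larger labeled sample set.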




