
Software Defect Prediction Using Unsupervised Methods

Advisor: 袁賢銘

Abstract


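The main goal of software defect prediction is to assess software quality using machine learning techniques. Zhang et al. [1] proposed applying spectral clustering, a connectivity-based clustering algorithm, to software defect prediction; their experimental results show that spectral clustering can predict as well as, or even better than, supervised learning models. This thesis is inspired by their work and attempts to reproduce the same results with spectral clustering. However, our experimental results differ substantially from theirs, so the details and process of the replication are recorded in full in this thesis. In addition, this thesis applies and examines three community detection algorithms on three selected datasets: AEEEM, NASA, and PROMISE. To the best of our knowledge, this is the first time these three algorithms have been applied to software defect prediction and evaluated for their predictive power in this field. We also apply different adjacency matrices and four feature selection methods to these three datasets to find the best-performing combination, and we design the whole experimental procedure as a lightweight framework so that other researchers can easily reproduce the results of this thesis.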

Parallel Abstract


Software defect prediction (SDP) aims to assess software quality using machine learning techniques. Zhang et al. [1] proposed applying spectral clustering, a connectivity-based unsupervised learning method, and reported impressive performance compared with supervised learning models. This paper is inspired by their work and focuses on replicating their spectral clustering experiment. However, our results show a large gap in AUC compared with theirs, so the exhaustive experiment steps are recorded in this paper. Additionally, this paper examines three community detection methods on the selected datasets: AEEEM, NASA, and PROMISE. To the best of our knowledge, this is the first time these methods have been applied to SDP and evaluated for their predictive power. This paper also uses another adjacency matrix, two feature selection methods, and two feature reduction methods to find the combination that performs best on these datasets. To make our work easy to replicate, a lightweight framework has been designed and released for future investigation.
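To make the idea of a connectivity-based unsupervised classifier concrete, the following is a minimal sketch of spectral bipartitioning applied to a matrix of software metrics. It is not taken from this thesis or its framework. The function name `spectral_defect_labels`, the dot-product adjacency matrix, and the rule that the cluster with the larger average metric values is labeled defective are all assumptions made for illustration, loosely following the general approach described in [1].

```python
import numpy as np


def spectral_defect_labels(X):
    """Split modules into two clusters; True marks the cluster guessed defective.

    X is an (n_modules, n_metrics) array with n_modules >= 2.
    Hypothetical helper, not the thesis's code.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # z-score every metric; constant columns keep std 1 to avoid dividing by zero
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std

    # Assumed adjacency: dot-product similarity, negatives clipped, no self-loops
    W = Z @ Z.T
    W[W < 0] = 0.0
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    deg = W.sum(axis=1)
    d_inv_sqrt = np.zeros(n)
    connected = deg > 0
    d_inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 belongs to the
    # second-smallest eigenvalue, whose sign pattern gives the bipartition
    _, eigvecs = np.linalg.eigh(L)
    v = d_inv_sqrt * eigvecs[:, 1]
    in_a = v > 0

    # Degenerate split (everything on one side): predict no defects
    if in_a.all() or not in_a.any():
        return np.zeros(n, dtype=bool)

    # Assumed heuristic: defective modules tend to have larger metric values,
    # so the cluster with the larger mean row sum is labeled defective
    row_sum = Z.sum(axis=1)
    if row_sum[in_a].mean() < row_sum[~in_a].mean():
        in_a = ~in_a
    return in_a
```

Calling `spectral_defect_labels(X)` on a metrics matrix returns one boolean per module. No labeled training data is involved, which is what makes the approach unsupervised. If the similarity graph is disconnected, the second eigenvector is not uniquely defined, so a real replication would need to handle that case explicitly.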

References


[1] F. Zhang et al. “Cross-Project Defect Prediction Using a Connectivity-Based Unsupervised Classifier”. In: 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE). 2016, pp. 309–320. DOI: 10.1145/2884781.2884839.
[4] K. Kawata et al. “Improving Relevancy Filter Methods for Cross-Project Defect Prediction”. In: 2015 3rd International Conference on Applied Computing and Information Technology/2nd International Conference on Computational Science and Intelligence. 2015, pp. 2–7. DOI: 10.1109/ACIT-CSI.2015.104.
[5] B. Turhan et al. “On the relative value of cross-company and within-company data for defect prediction”. In: Empirical Software Engineering 14.5 (2009), pp. 540–578. ISSN: 1573-7616. DOI: 10.1007/s10664-008-9103-7. URL: https://doi.org/10.1007/s10664-008-9103-7.
[6] F. Peters et al. “Better cross company defect prediction”. In: 2013 10th Working Conference on Mining Software Repositories (MSR). 2013, pp. 409–418. DOI: 10.1109/MSR.2013.6624057.
[7] N. Bettenburg et al. “Think locally, act globally: Improving defect and effort prediction models”. In: 2012 9th IEEE Working Conference on Mining Software Repositories (MSR). 2012, pp. 60–69. DOI: 10.1109/MSR.2012.6224300.
