  • Thesis


Improving Detection Performance of Duplicate Bug Reports using Extended Class Centroid Information

Advisor: 楊正仁


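In software maintenance, bug reports play a very important role in correcting errors in software packages. In the development of many software projects, however, there are often many duplicate bug reports, so maintainers must spend considerable manpower and time marking the duplicates. In this study, we propose a detection scheme based on extended class centroid information (ECCI) to strengthen the detection of duplicate bug reports. Through experiments on the open-source projects Apache, ArgoUML, and SVN, we verify that ECCI effectively improves the accuracy of duplicate bug report detection. The experimental results also show that ECCI outperforms the other detection methods on all of the software projects tested, indicating that the proposed extended class centroid information does improve the detection of duplicate bug reports.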


In software maintenance, bug reports play an important role in correcting software packages. Unfortunately, many software projects accumulate a large number of duplicate bug reports. Processing these duplicates is time-consuming and raises the cost of software maintenance. In this study, we propose a detection scheme based on extended class centroid information (ECCI) to improve the detection of duplicate bug reports. The effectiveness of ECCI is verified in an empirical study with three open-source projects: SVN, ArgoUML, and Apache. The experimental results show that ECCI outperforms the other detection schemes in all cases. These promising results suggest that ECCI has potential for practical application.




