

The Research of Source Code Detecting for Plagiarism Program




Past studies have focused mainly on detecting plagiarism in programs. Only a few have tried to identify the source of the plagiarism and the groups of plagiarized programs, and those methods were not designed for student plagiarism. Building on the literature on plagiarism and copy detection, this study combines similar assignments into groups and introduces the concept of "important fragments" of programs. It calculates how likely each assignment in a plagiarism group is to be the source from three properties of the important fragments: how they are referenced, how they propagate transitively, and their similarity within the group and difference between groups. Finally, weight training is used to train the weights for each plagiarism group, which improves the likelihood of detecting the true source. The experimental results show that: (1) when plagiarism scores are sampled from one to five groups, the source detection rate is good in every case; (2) weight training effectively raises the weighted score of the true source and lowers the false positive rate for non-sources; (3) the three-stage score calculation for important fragments effectively separates the scores within a group, making the true source easier to detect. Once the plagiarism groups and their true sources have been identified, the instructor can compare the plagiarism groups with the relationships among students, such as competing groups, to further assess ambiguous cases of plagiarism between students.
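The scoring idea above can be illustrated with a minimal sketch. It assumes that each assignment in a plagiarism group already has three component scores in [0, 1] derived from its important fragments, and that the source likelihood is a plain weighted sum of them. The abstract does not give the actual formula, so the names `FragmentScores`, `source_likelihood`, and `rank_sources`, the weighted-sum form, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FragmentScores:
    """Hypothetical per-assignment scores, each assumed to lie in [0, 1]."""
    reference: float       # how strongly other assignments reference its important fragments
    transitivity: float    # how far those fragments propagate through chains of copies
    group_contrast: float  # within-group similarity combined with between-group difference


def source_likelihood(scores, weights):
    """Weighted sum of the three component scores (an assumed form, not the paper's formula)."""
    w_ref, w_trans, w_contrast = weights
    return (w_ref * scores.reference
            + w_trans * scores.transitivity
            + w_contrast * scores.group_contrast)


def rank_sources(group, weights):
    """Return assignment ids ordered from most to least likely source."""
    return sorted(group,
                  key=lambda sid: source_likelihood(group[sid], weights),
                  reverse=True)


# Made-up scores for one plagiarism group of three assignments.
group = {
    "student_a": FragmentScores(0.9, 0.8, 0.7),
    "student_b": FragmentScores(0.4, 0.3, 0.6),
    "student_c": FragmentScores(0.2, 0.1, 0.5),
}
weights = (0.5, 0.3, 0.2)  # illustrative; the study obtains its weights by training

ranking = rank_sources(group, weights)
```

In this sketch the assignment ranked first is the candidate source. Trained weights would replace the fixed tuple, with the aim of widening the gap between the true source and the other members of the group.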


