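Abstract (Chinese version) This paper presents a method for classifying C programs that is built mainly on a string comparison system. Each C program is converted by F(p)-L into an algebraic expression, so that every program can be treated as a text string; the string comparison system then computes the degree of difference (edit distance) between programs, and the programs are grouped into classes on that basis. For the string comparison system, this paper extends the operation rules of traditional string comparison. The basic traditional method generally allows insertion (Insert), deletion (Delete), and replacement (Replace), and a further refinement adds transposition (Transposition). This paper extends the transposition rule so that a better (that is, smaller) edit distance can be obtained when processing C programs. To improve the measured similarity between two programs, we unify a small part of each program's formatting without altering its control flow or semantics, and rename its variables (Binding), with the aim of finding the smallest degree of difference between programs.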
ABSTRACT This paper proposes a classification method for C programs based on a string comparison system. Each C program is converted by F(p)-L into an algebraic expression, so that every program can be treated as a text string. A string comparison algorithm then computes the edit distance between programs, and this distance is used to group the programs into classes. For the string comparison system, we extend the operation rules of traditional string comparison. The basic traditional method allows three operations: insertion, deletion, and replacement. A further refinement adds transposition. This paper extends the transposition rule so that better (that is, smaller) edit distances are obtained for C programs. To increase the measured similarity between two programs, we normalize a small part of each program's formatting without changing its control flow or semantics, and we rename (re-bind) its variables, with the aim of finding the smallest edit distance between the programs.
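As a point of reference for the four operations named above, the following is a minimal sketch of the textbook edit distance with insertion, deletion, replacement, and transposition of two adjacent characters (often called the optimal string alignment distance). It is not the extended transposition rule proposed in this paper, which the abstract does not specify, and it does not implement the F(p)-L conversion; the function name osa_distance and the unit cost assigned to every operation are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest of three values. */
static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Edit distance between s and t with unit-cost insertion, deletion,
 * replacement, and transposition of two ADJACENT characters.
 * This is the standard restricted variant, not the paper's extension.
 * Returns (size_t)-1 if memory allocation fails.
 */
size_t osa_distance(const char *s, const char *t) {
    size_t n = strlen(s);
    size_t m = strlen(t);
    size_t cols = m + 1;

    /* d[i * cols + j] = distance between s[0..i) and t[0..j). */
    size_t *d = malloc((n + 1) * cols * sizeof *d);
    if (d == NULL) {
        return (size_t)-1;
    }

    /* Turning a prefix into the empty string takes one deletion per
       character, and the reverse takes one insertion per character. */
    for (size_t i = 0; i <= n; i++) d[i * cols] = i;
    for (size_t j = 0; j <= m; j++) d[j] = j;

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            size_t cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,         /* delete  */
                               d[i * cols + (j - 1)] + 1,         /* insert  */
                               d[(i - 1) * cols + (j - 1)] + cost /* replace */);

            /* Transposition: the last two characters are swapped. */
            if (i > 1 && j > 1 &&
                s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                size_t swapped = d[(i - 2) * cols + (j - 2)] + 1;
                if (swapped < best) best = swapped;
            }
            d[i * cols + j] = best;
        }
    }

    size_t result = d[n * cols + m];
    free(d);
    return result;
}
```

With this sketch, osa_distance("ab", "ba") returns 1, because a single transposition suffices, whereas a distance restricted to insertion, deletion, and replacement would give 2 for the same pair. The table uses O(nm) memory for clarity; only the last three rows are needed if memory is a concern.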