With continuing advances in information technology, enterprises and institutions accumulate digital documents at a rapid pace, and these documents reflect the knowledge accumulated across specialized fields. Because enterprise logs are both numerous and varied, having the users in each department classify log entries according to their own experience is time-consuming. The log entries also contain many technical terms from other departments, for example SNR (Signal-to-Noise Ratio), DNS (Domain Name System), and CM (Cable Modem). This study proposes classifying documents according to such technical terms, so that users can rely on the system's automatic document classification to assign logs to the correct categories quickly and accurately. The study applies ontology analysis to identify the relationships among term depths, examines the classification results against other classification methods, and uses a neural network module to classify the documents automatically. Classification results are evaluated with precision, recall, and F1. The experimental results show that term presence (TP) combined with a term-depth measure and neural network classification performs better than the traditional classification method and TF-IDF. These results suggest that TP combined with key-term depth can improve the precision of automatic log classification. In future work, key terms from each department's event records will continue to be collected to improve the effectiveness of automatic classification.
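As a rough illustration of the features and metrics named above, the following Python sketch builds a term-presence vector scaled by ontology depth and computes precision, recall, and F1 for one category. The abstract does not say how depth is combined with term presence, so the multiplication used here is an assumption. The depth values, the vocabulary, and the function names `tp_depth_vector` and `precision_recall_f1` are likewise hypothetical. The neural network itself is omitted; vectors like these would serve as its input.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical ontology depths for a few technical terms (illustrative values only).
TERM_DEPTH: Dict[str, int] = {"snr": 3, "dns": 2, "cm": 3}
VOCAB: List[str] = sorted(TERM_DEPTH)  # fixed feature order: cm, dns, snr


def tp_depth_vector(text: str) -> List[float]:
    """Term presence (depth if the term occurs, else 0), ignoring term frequency.

    Assumption: presence is scaled by ontology depth via simple multiplication.
    """
    tokens = set(text.lower().split())
    return [float(TERM_DEPTH[term]) if term in tokens else 0.0 for term in VOCAB]


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for a single category treated as positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Unlike TF-IDF, the vector records only whether a term occurs, so a log entry that mentions "SNR" five times gets the same feature value as one that mentions it once.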