Data preprocessing is an important step in web usage mining. In this paper, we discuss several practical problems encountered in data preprocessing and present methods for solving them. In a web usage mining process, if web structure analysis is not performed first, data preprocessing cannot be completed reliably, which in turn seriously affects the accuracy of pattern discovery. Therefore, we use the reachability property of Stochastic Timed Petri Nets (STPN), together with the data structure produced by web structure analysis, to support web content scope recognition and path completion during data preprocessing.