
應用網站探勘技術於網友瀏灠行為分析-以內容服務網站為例

Apply Web Mining Techniques to Analyze the Navigation Behavior of Visitors - Using Online Content Site as Example

Advisor: 曹承礎

Abstract


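According to a TWNIC survey report from mid-January 2005, the Internet population in Taiwan has grown to about 13.8 million people, and 4.63 million households are online, a penetration rate of 65.02%. The Internet has therefore become not only a powerful media platform but also an indispensable channel for every enterprise. Enterprises are considering how to use this channel to collect data on customers' browsing behavior and maintain customer relationships, identify customers' real needs, and improve service quality and satisfaction, so as to strengthen customer loyalty and profit from customer value over the long term.

In practice, however, analyzing customer browsing behavior from website logs still poses several unsolved difficulties: (1) the information in website logs is fragmented and scattered, so the data are inherently incomplete; (2) website logs are usually very large, so extracting them and transforming them into useful information is a challenge; (3) mining knowledge that is useful to the enterprise is also difficult.

The contributions of this research are: (1) a practical and simple analysis framework that analyzes website logs effectively and offers solutions to the difficulties above; and (2) an algorithm that computes how popular each web page is in terms of clicks and then draws a two-dimensional browsing map of the website, presenting the mining results visually so that they are easier to interpret. By applying the analysis framework and guidelines of this research, visitors' browsing behavior can be extracted from website logs, helping enterprises understand their customers better and learn about customer preferences.

The results can be used to improve (1) website structure design and (2) the design of page navigation flows; (3) they can also be used to analyze the click records of a single customer, a specific target group of customers, or all customers of the site. The analysis yields the popular pages and the navigation paths that customers prefer. Once higher-level semantic information about the website is added, for example which pages relate to the shopping cart and which are entry pages for the search service, much practical knowledge for running the website can be obtained.

This research takes as its subject an actual content website in online service: the raw web log data cover one month, from March to April 2005, with about 66 million records and 1.26 GB in size. After processing, three of the most representative data sets were selected as examples to demonstrate empirically the applicability of the proposed analysis framework and algorithm.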

Parallel Abstract


According to a survey report issued by TWNIC in January 2005, the number of Internet users in Taiwan had grown to 13,800,000, and about 4,630,000 households were online, approaching 65% of all households in Taiwan. The Internet is therefore not only a powerful medium but has also become an important channel for enterprises. Enterprises are eager to find a useful way to exploit this channel. They have been trying to analyze web visit logs and mine the behavior of customers who contact the enterprise through the Internet, hoping to collect more customer information and provide more personalized services. In practice, however, several difficulties arise. First, web logs are distributed information, spread across several servers, and must be integrated and heavily processed. Second, it is difficult to extract key features from huge logs and to resolve the resulting scalability issues. Third, suitable mining tools must be found to discover implicit knowledge in a mass of irrelevant raw data. Our research proposes a novel framework that integrates the most useful public-domain resources with some self-developed tools and provides powerful analysis tools to overcome these difficulties. This thesis also presents a novel algorithm, named "Click-map", for visualizing click-stream mining results. This presentation helps the webmaster discover users' navigation behavior from click-path analysis more easily. To examine the applicability of the framework and analysis methods, we use one month of online web logs as examples. The logs came from an online content search service site, amount to 1.26 GB and over 66 million records, and were recorded from March to April 2005. The results showed our framework to be useful and effective.
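The abstract does not give the details of the Click-map algorithm, so the following is only a minimal sketch of the kind of click-path counting that such an analysis could start from, not the thesis's method. It assumes log lines in Common Log Format and treats each client IP as one visitor; both are assumptions, as are the names `parse_line`, `click_statistics`, and `SAMPLE_LOG`, which are hypothetical.

```python
import re
from collections import Counter

# Assumption: logs follow the Common Log Format. The pattern captures the
# client host, the timestamp, the requested path and the HTTP status code.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+)[^"]*" (\d{3})'
)


def parse_line(line):
    """Return (host, timestamp, path, status), or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    host, timestamp, path, status = match.groups()
    return host, timestamp, path, int(status)


def click_statistics(lines):
    """Count page hits and page-to-page transitions per visitor.

    Assumption: a visitor is identified by client host only, and lines are
    already in chronological order. Only successful (200) requests count.
    """
    page_hits = Counter()    # path -> number of successful requests
    transitions = Counter()  # (previous path, next path) -> count
    last_page = {}           # host -> last path that host requested
    for line in lines:
        record = parse_line(line)
        if record is None or record[3] != 200:
            continue
        host, _, path, _ = record
        page_hits[path] += 1
        previous = last_page.get(host)
        if previous is not None and previous != path:
            transitions[(previous, path)] += 1
        last_page[host] = path
    return page_hits, transitions


# Hypothetical sample data, made up for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Mar/2005:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.1 - - [01/Mar/2005:10:00:05 +0800] "GET /search HTTP/1.1" 200 2048',
    '10.0.0.2 - - [01/Mar/2005:10:00:07 +0800] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.1 - - [01/Mar/2005:10:00:09 +0800] "GET /article/42 HTTP/1.1" 200 4096',
    '10.0.0.2 - - [01/Mar/2005:10:00:12 +0800] "GET /search HTTP/1.1" 200 2048',
    '10.0.0.2 - - [01/Mar/2005:10:00:15 +0800] "GET /missing HTTP/1.1" 404 0',
]

if __name__ == "__main__":
    hits, paths = click_statistics(SAMPLE_LOG)
    print(hits.most_common())
    print(paths.most_common())
```

The hit counts correspond to page popularity, and the transition counts could serve as edge weights when drawing a two-dimensional map of navigation paths. A real analysis would also need session reconstruction (for example, splitting a visitor's requests after a period of inactivity), which this sketch leaves out.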


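Cited By


胡少杰 (2011). Application of a Hybrid Item Recommendation Ranking Algorithm to a Health Knowledge Website [Master's thesis, Taipei Medical University]. Airiti Library. https://doi.org/10.6831/TMU.2011.00115
吳啟鳴 (2013). Link Recommendation for Online Bookstores Based on Bibliographic Relationships and User Browsing Paths [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2013.02314
蕭芳祥 (2008). Service Innovation in Print Media: The Case of Interactive Newspapers [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2008.00912
洪范文 (2010). Building Website Structure Through Web Log Mining [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315203559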
