A Link-Based Study of Website Behavior

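With the rapid growth of the Internet and the steadily falling age of its users, inappropriate material online can easily harm immature users when parents are not watching. At the same time, the Internet's borderless nature and widely distributed nodes leave government agencies with little power to regulate the information circulating on it, so filtering mechanisms must be developed by private companies or organizations. Most web content filtering software on the market today identifies inappropriate websites by keyword comparison or content analysis. Both techniques have been developed for many years and are mature, but it is worth noting that text-based analysis readily runs into obstacles across different cultural backgrounds. This thesis approaches the identification and collection of topic-specific websites from a different angle. We focus on the new properties that documents exhibit in a hyperlinked environment (the Internet): by observing the behavior of websites on a specific topic and analyzing the information that link structure gives us about those sites, we attempt to develop an algorithm suited to collecting and identifying websites of a particular type.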

Keywords

link; website; URL

Parallel Abstract

The growth of the World Wide Web has brought some unexpected social problems, one of which is the influx of material unsuitable for children, such as pornography and hate-group content. How to shield impressionable minds from such material has become a challenge for computer scientists. One common approach is to build a content filtering tool that blocks websites containing improper information from being transmitted to the browser. Most content filtering software uses keyword comparison or content analysis to identify such websites. Although these methods are effective to some extent, they have drawbacks. For instance, the same words may represent different concepts in different cultures, which can lead to misdetection. When a purely text-based mechanism is applied across different cultural environments, blocking sites by mistake or failing to block intended sites becomes a critical issue. In this thesis, we propose a new approach to website analysis. Our method is based on the observation that related websites tend to refer to each other through hyperlinks. A graph-based algorithm that exploits this property has been designed and implemented. Using the collection of pornographic sites as an example, we show that our algorithm is efficient and effective in finding related sites. Additional experiments on butterfly-related and gun-related websites have also produced satisfactory results.
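The abstract does not spell out the graph-based algorithm itself, so the following is only a minimal sketch of the underlying observation that related websites tend to link to each other. It assumes a hypothetical input format (a list of `(source_site, target_site)` pairs) and a hypothetical admission rule (a candidate joins the set once it is linked with at least `min_links` sites already in the set). The function name, parameters, and example hosts are all illustrative assumptions, not the thesis's actual method.

```python
from collections import defaultdict


def expand_related_sites(links, seeds, min_links=2, max_rounds=10):
    """Grow a set of related sites from seed sites using link structure only.

    links: iterable of (source_site, target_site) pairs (assumed input format).
    seeds: sites already known to belong to the topic.
    A candidate is admitted when it is linked (in either direction) with at
    least `min_links` distinct sites that are already in the set. This rule
    is an assumption made for illustration.
    """
    # Treat links as undirected for the purpose of counting relationships.
    neighbors = defaultdict(set)
    for src, dst in links:
        if src != dst:
            neighbors[src].add(dst)
            neighbors[dst].add(src)

    related = set(seeds)
    for _ in range(max_rounds):
        # Count, for each outside site, how many current members it touches.
        counts = defaultdict(int)
        for site in related:
            for other in neighbors.get(site, ()):
                if other not in related:
                    counts[other] += 1
        new_sites = {s for s, c in counts.items() if c >= min_links}
        if not new_sites:
            break  # no candidate is connected strongly enough; stop growing
        related |= new_sites
    return related


# Hypothetical link data: a, b, c, d reference each other; x and y do not.
example_links = [
    ("a.example", "b.example"),
    ("b.example", "a.example"),
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "d.example"),
    ("a.example", "d.example"),
    ("x.example", "a.example"),
    ("x.example", "y.example"),
]
result = expand_related_sites(example_links, {"a.example", "b.example"})
print(sorted(result))
```

Requiring more than one link before admitting a site is a simple guard against a single stray hyperlink pulling in unrelated sites. Because no page text is inspected, a rule of this kind does not depend on the language or cultural context of the pages.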

Parallel Keywords

link; URL; hyperlink

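Cited By

陳榮佐 (2006). 網路位置為基礎的網站數量評估機制 -以色情網站為例 [A network-location-based mechanism for estimating the number of websites: the case of pornographic websites] [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2006.01938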
