  • Thesis

運用時間序列分群於社會性標籤之研究

A Study of Applying Time Series Clustering to Social Tagging

Advisor: 柯皓仁

Abstract


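As the Internet has become ubiquitous, new online services and business models keep emerging, and since the rise of Web 2.0, architectures built on sharing and participation have become the mainstream of Web services. Social tagging is an important Web 2.0 service: it lets users tag Web resources, realizing a mechanism that uses collective effort to collect and share those resources. Social tagging has become popular because of its underlying social nature, which lets the public converse and interact through tags.

This study takes the social dimension of social tagging as its starting point and examines how social tags change over time in order to understand social trends. Using a time series clustering algorithm, it first collects tags from the HEMiDEMi (黑米) shared bookmarking site and converts them into time series according to the content of the Web pages they label, then finds clusters of tags that follow similar trends within the same time interval; each such cluster forms a topic concept. Next, it computes the similarity between clusters formed in different time intervals to extract, across all intervals, the clusters that share similar topic concepts and the tags they contain. In addition, for each individual tag, it analyzes the tag's trend across time intervals together with its related tags and Web pages. Finally, these results are integrated in a prototype interface developed for this study.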
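The first step described above, turning tags into time series and grouping tags with similar trends inside one time interval, can be illustrated with a minimal Python sketch. The abstract does not specify the time granularity, the distance measure, or the clustering algorithm, so everything here is an assumption made for illustration: weekly counts of tagging events stand in for the series derived from page content, series are z-normalized, and tags are merged by average-linkage agglomerative clustering on Euclidean distance until a hypothetical `threshold` is exceeded. The names `build_series` and `cluster_tags` and the sample tags are likewise hypothetical.

```python
from datetime import date, timedelta
from math import sqrt

def build_series(events, start, n_weeks):
    """Count how often each tag was applied in each week of one time interval."""
    series = {}
    for tag, day in events:
        week = (day - start).days // 7
        if 0 <= week < n_weeks:
            series.setdefault(tag, [0] * n_weeks)[week] += 1
    return series

def z_normalize(values):
    """Rescale a series to zero mean and unit variance so only its shape matters."""
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_tags(series, threshold):
    """Average-linkage agglomerative clustering (an assumed choice).

    Merging stops once the two closest clusters are farther apart than threshold.
    """
    normed = {tag: z_normalize(values) for tag, values in series.items()}
    clusters = [[tag] for tag in sorted(normed)]

    def linkage(c1, c2):
        total = sum(euclidean(normed[a], normed[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [set(c) for c in clusters]

# Made-up tagging events: two tags that rise together and one that declines.
start = date(2009, 1, 5)
weekly_counts = {
    "election": [1, 2, 4, 8],
    "vote": [2, 4, 8, 16],
    "typhoon": [8, 4, 2, 1],
}
events = [
    (tag, start + timedelta(weeks=week))
    for tag, counts in weekly_counts.items()
    for week, count in enumerate(counts)
    for _ in range(count)
]
series = build_series(events, start, n_weeks=4)
clusters = cluster_tags(series, threshold=1.0)
print(clusters)
```

Because the series are z-normalized, "election" and "vote" end up in the same cluster even though their absolute counts differ, while "typhoon", which trends the opposite way, stays apart. A different distance measure or linkage rule could be substituted without changing the overall flow.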

Parallel Abstract


With the widespread adoption of the Internet, a variety of Internet applications have emerged. With the boom of Web 2.0, participation and sharing have become the central concepts of Web services. Social tagging is an essential service of the Web 2.0 era: it allows users to label Web resources with keywords of their own choosing, and it enables the cooperative acquisition and sharing of Web resources. The key reason social tagging has attracted attention is its "social" character, which facilitates communication and interaction among people. Building on this social character, this study examines how social tags vary over time in order to understand social trends. Time series clustering of tags is employed. First, tags collected from HEMiDEMi are transformed into time series according to the Web pages labeled by these tags, and time series clustering is then used to identify tag clusters with similar chronological patterns and trends within the same time period. Next, the similarity between clusters from different time periods is calculated to extract the clusters, and their associated tags, that share similar concepts across periods. Furthermore, the trend of each individual tag across time periods is analyzed, and its related tags and Web pages are identified. Finally, these research outcomes are integrated into a prototype system.
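The cross-period step, comparing clusters formed in different time periods to find those with similar concepts, can be sketched as follows. The abstract does not say how cluster similarity is computed, so this sketch assumes Jaccard overlap between the tag sets of two clusters; the function names, the `min_similarity` cutoff, and the sample clusters are hypothetical.

```python
def jaccard(a, b):
    """Share of tags common to both clusters, relative to all tags in either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_clusters(period_a, period_b, min_similarity):
    """Return (index_a, index_b, similarity) for cluster pairs at or above the cutoff.

    Pairs are ordered from most to least similar.
    """
    pairs = []
    for i, cluster_a in enumerate(period_a):
        for j, cluster_b in enumerate(period_b):
            score = jaccard(cluster_a, cluster_b)
            if score >= min_similarity:
                pairs.append((i, j, score))
    return sorted(pairs, key=lambda pair: -pair[2])

# Made-up tag clusters from two consecutive time periods.
january = [{"election", "vote", "candidate"}, {"typhoon", "rain"}]
february = [{"election", "vote", "ballot"}, {"recipe", "dessert"}]
matches = match_clusters(january, february, min_similarity=0.4)
print(matches)
```

Here the two election-related clusters share two of four distinct tags, so they are matched, while the remaining clusters have no tags in common. If clusters were instead compared by the content of their tagged pages, only `jaccard` would need to be replaced.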

