  • Thesis

Measurement on Writing Style Similarity (寫作風格相似性之度量)

Advisor: 陳宜欣

Abstract


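To identify anonymous accounts on social networks that use fake opinions to attack their opponents, we propose a method for measuring writing style similarity that uncovers the potential relationships between accounts and their real identities. We extract a variety of features from academic papers and from text on multilingual online forums and convert them into vectors that represent each user's writing style. We then compare users pairwise to compute their similarity. Finally, we adopt a confidence-based mechanism that maintains the precision of the overall method by expanding the number of users stage by stage and filtering the writing features. The experimental results support our choice of writing features and show that precision drops only slightly as the number of users grows.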

Parallel Abstract


In this paper, we introduce a measure of writing style similarity for determining the potential relationships between anonymous users, who are often used to post fake opinions and attack their opponents. Using English academic writing and text crawled from a multilingual online forum, our approach extracts various signatures, forms representative vectors that describe each user's writing style, and compares users pairwise by similarity.
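
The pipeline outlined in the abstract (extract signatures, form one vector per user, compare users pairwise) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the thesis's actual signatures or similarity function, so the function-word and punctuation frequencies, the cosine similarity, and the names `style_vector`, `cosine_similarity`, and `pairwise_similarities` are all hypothetical.

```python
import math
import re
from itertools import combinations

# Hypothetical signature set: the thesis's real features are not given in the abstract.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "it"]
PUNCTUATION = [",", ".", "!", "?", ";"]


def style_vector(text):
    """Map a user's text to relative frequencies of function words and punctuation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty text
    n_chars = max(len(text), 1)
    word_part = [tokens.count(w) / n_tokens for w in FUNCTION_WORDS]
    punct_part = [text.count(p) / n_chars for p in PUNCTUATION]
    return word_part + punct_part


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(texts_by_user):
    """Compute one similarity score for every unordered pair of users."""
    vectors = {user: style_vector(text) for user, text in texts_by_user.items()}
    return {
        (a, b): cosine_similarity(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }
```

Calling `pairwise_similarities({"user_a": "...", "user_b": "..."})` returns a dictionary keyed by user pairs. The confidence-based mechanism described in the Chinese abstract (staged growth of the user set with feature filtering) is not modeled in this sketch.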

Parallel Keywords

writing style; social network; sock puppet; text mining

