  • Degree thesis

Recommendation of Potential Friends from E-Mail Networks

Advisor: 洪智力

Abstract


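In recent years, many email clients and webmail services have begun to offer mechanisms that recommend recipients. However, the lists these mechanisms produce contain only the senders and recipients with whom the user corresponds frequently; they do not recommend recipients according to the relationship between the email content and the discussion group. As a result, the recommended list includes people who do not belong to the discussion group, which lowers recommendation accuracy. Recommending on the basis of email content and discussion groups instead would make it easier for senders to choose recipients when sending a message. This study therefore proposes a method that recommends recipients by grouping emails according to their discussion content. The method consists of three modules. The data preprocessing module extracts the subject and body of every email in the Enron email dataset and obtains nouns through preprocessing steps that delete the email header, remove stopwords, and extract words by part of speech. The recommendation module computes the entropy of the nouns to select important terms, uses those terms as the features for measuring cosine similarity between emails, and finds the group of emails similar to a new email. It then analyzes the important terms and the senders and recipients in that similar group to recommend the closest-matching list of recipients. The evaluation module applies this content-based recommendation method under different experimental conditions and validates the recommendations it produces. The experimental results confirm that the important terms and contacts of emails with similar content can be used to recommend a fairly accurate list of recipients.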
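The recommendation step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis implementation: the toy corpus, the function names, the thresholds (`min_df`, `max_entropy`, `threshold`), and in particular the choice to treat low-entropy nouns (those concentrated in a few emails) as the important terms and to score contacts by summed similarity. The abstract does not specify these details. The sketch also assumes that the preprocessing module has already reduced each email to a list of nouns.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): each email is already reduced to its nouns
# and to the contacts (sender and recipients) that appear on it.
EMAILS = [
    {"nouns": ["gas", "contract", "pipeline", "meeting"], "people": ["alice", "bob"]},
    {"nouns": ["gas", "contract", "price", "meeting"], "people": ["alice", "carol"]},
    {"nouns": ["party", "lunch", "friday", "meeting"], "people": ["dave", "erin"]},
    {"nouns": ["party", "lunch", "cake", "meeting"], "people": ["dave", "bob"]},
]


def term_entropy(docs):
    """Shannon entropy (base 2) of each noun's distribution over the emails."""
    counts = [Counter(doc) for doc in docs]
    totals = Counter()
    for c in counts:
        totals.update(c)
    entropy = {}
    for term, total in totals.items():
        h = 0.0
        for c in counts:
            if c[term]:
                p = c[term] / total
                h -= p * math.log2(p)
        entropy[term] = h
    return entropy


def select_keywords(docs, min_df=2, max_entropy=1.5):
    """Assumed rule: keep nouns that occur in at least min_df emails
    and whose entropy does not exceed max_entropy."""
    entropy = term_entropy(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t for t, h in entropy.items() if df[t] >= min_df and h <= max_entropy}


def keyword_vector(nouns, keywords):
    """Count vector of an email restricted to the selected keywords."""
    return Counter(n for n in nouns if n in keywords)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(new_nouns, emails, keywords, threshold=0.5, top_n=3):
    """Rank contacts of emails similar to the new one by summed similarity."""
    query = keyword_vector(new_nouns, keywords)
    scores = Counter()
    for email in emails:
        sim = cosine(query, keyword_vector(email["nouns"], keywords))
        if sim >= threshold:
            for person in email["people"]:
                scores[person] += sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [person for person, _ in ranked][:top_n]


KEYWORDS = select_keywords([e["nouns"] for e in EMAILS])
RECOMMENDED = recommend(["gas", "contract", "storage", "meeting"], EMAILS, KEYWORDS)
print(sorted(KEYWORDS))
print(RECOMMENDED)
```

In this toy corpus the noun "meeting" appears in every email, so its entropy is high and it is filtered out, while nouns that occur only once are dropped by the document-frequency floor. A real run on the Enron dataset would need tuned thresholds and a proper part-of-speech tagger upstream.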

Keywords

Recommender System, Text Mining

Parallel Abstract


Recipient recommendation has become a feature of email clients and webmail services in recent years. However, the lists these features recommend often contain only frequent contacts; they do not take into account the correlation between the email contents and the discussion group. As a result, the recommended recipients include some contacts who do not belong to the related discussion group, which reduces the accuracy of the recommendation. In view of this, this paper proposes an approach to recipient recommendation that depends on the correlation between the email contents and the discussion group. The approach contains three modules. The preprocessing module first retrieves contents and subjects from the Enron email dataset, and then extracts nouns through steps such as deleting the email header, part-of-speech tagging, and removing stopwords. The recommendation module calculates the entropy of the nouns to pick out keywords, and then groups emails that are similar to a new email into a discussion group using cosine similarity. This research also analyzes the keywords and contacts in the discussion group to generate the list of recommended recipients. Finally, the proposed approach is verified with real email data. The results show that the proposed approach, based on the contents and contacts of similar emails, can recommend recipients fairly accurately.
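For reference, the two measures named in the abstract can be written out as follows. The abstract does not give the exact formulation used in the thesis, so this is one common definition, stated here as an assumption: $f_{t,d}$ denotes the number of times noun $t$ occurs in email $d$, and $\mathbf{a}$, $\mathbf{b}$ are the keyword-count vectors of two emails.

```latex
\[
p_{t,d} = \frac{f_{t,d}}{\sum_{d'} f_{t,d'}}, \qquad
H(t) = -\sum_{d \,:\, f_{t,d} > 0} p_{t,d} \log_2 p_{t,d}
\]
\[
\mathrm{sim}(\mathbf{a}, \mathbf{b}) =
\frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
\]
```

Under this definition, a noun spread evenly over $N$ emails has the maximum entropy $\log_2 N$, while a noun confined to a single email has entropy $0$.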

Parallel Keywords

Recommender System, Text Mining, Entropy
