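With the rapid growth of social networks, rising awareness of personal privacy protection, and the enactment of related laws, protecting the personal privacy implicit in social networks has become an important and urgent problem. To keep social network data released for research from endangering individuals' privacy, the data is first anonymized by removing directly identifying information (e.g., names). However, an attacker can still recover the private information it contains through de-anonymization techniques such as the neighborhood attack. To prevent this, the anonymization is strengthened so that the number of candidates an attack returns satisfies k-anonymity, but doing so can over-anonymize the data and leave its utility too low. To find a better balance between the two, this work proposes clustering the data first and then anonymizing it, so that only the higher-risk data is processed and fewer other records are affected. The same degree of anonymization can therefore be reached with less data distortion.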
Protecting personal privacy has become an important issue in online social networks. An attacker can launch a 1-neighborhood attack using local knowledge about an individual, such as the victim's 1-neighborhood subgraph, which consists of the victim's one-hop neighbors and the relationships among those neighbors. To protect personal relationship privacy when releasing social network data for research, the k-anonymity requirement must be satisfied. However, there is a trade-off between re-identification risk and information loss. The aim of this work is to present a cluster-based anonymization method that satisfies the privacy requirement with less data distortion. The experimental results show that the proposed approach achieves a higher level of k-anonymity privacy protection with low information loss.
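To make the 1-neighborhood attack and the k-anonymity requirement concrete, the following is a minimal Python sketch, not the method proposed in this work. It assumes an undirected graph stored as an adjacency dictionary, and it stands in for a full subgraph isomorphism test with a coarse fingerprint (the number of neighbors plus the sorted degrees of those neighbors within the neighborhood). Two vertices with different fingerprints cannot have isomorphic 1-neighborhoods, but equal fingerprints do not guarantee isomorphism, so the sketch may miss some at-risk vertices. The names `build_adj`, `neighborhood_signature`, and `at_risk_vertices` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def build_adj(edges):
    """Build an undirected adjacency dict {vertex: set(neighbors)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def neighborhood_signature(adj, v):
    """Coarse fingerprint of v's 1-neighborhood (an assumption of this sketch).

    Returns (number of neighbors, sorted tuple of each neighbor's degree
    counted only over edges to other neighbors of v).
    """
    nbrs = adj[v]
    inner_degrees = sorted(len(adj[u] & nbrs) for u in nbrs)
    return (len(nbrs), tuple(inner_degrees))


def at_risk_vertices(adj, k):
    """Vertices whose fingerprint is shared by fewer than k vertices.

    These are the vertices an attacker with 1-neighborhood knowledge could
    narrow down to fewer than k candidates under this coarse fingerprint.
    """
    sigs = {v: neighborhood_signature(adj, v) for v in adj}
    counts = Counter(sigs.values())
    return sorted(v for v, s in sigs.items() if counts[s] < k)


# Toy graph: a triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = build_adj(edges)
risky = at_risk_vertices(adj, k=2)
```

In the toy graph, `a` and `b` share a fingerprint and so are 2-anonymous under this check, while the remaining vertices are each unique. Identifying such higher-risk vertices first is in the spirit of processing only the higher-risk data, although the clustering and anonymization steps themselves are not shown here.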