  • Degree Thesis

A Usage-Purpose-Oriented Clustering Anonymization Technique for Privacy Preservation

UACT: A Utility-based Clustering Technique for Privacy Preservation

Advisor: 鄭伯炤

Abstract


With the rapid development of information and communications technology (ICT), more and more services rely on digitized data to improve service efficiency and quality. These data contain a great deal of private information that can directly or indirectly identify the individuals they describe. With rising awareness of personal privacy and growing legal regulation, data providers must apply de-identification before using or sharing data so that it satisfies the k-anonymity privacy requirement and prevents privacy disclosure. Many de-identification methods that satisfy k-anonymity already exist in the literature, but they treat all data attributes as equally important and ignore the fact that different usage scenarios require higher data fidelity for particular critical attributes. This thesis therefore proposes a usage-purpose-oriented privacy-preservation technique, UACT, which generates an attribute sequence from the user-defined critical attribute and quasi-identifiers, sorts the records according to that sequence, and then performs cluster-based generalization. Experimental results show that, when generalization is driven by different purposes and their corresponding critical attributes, UACT not only maintains higher fidelity on the critical attributes but also retains low information loss and fast execution.
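As a rough illustration of the workflow described above (a minimal sketch, not the thesis's actual UACT algorithm), the following Python example builds an attribute sequence from a user-defined critical attribute and quasi-identifiers, sorts records by that sequence, partitions them into clusters of at least k records, and generalizes each cluster. The column names, the fixed cluster size, and the interval-style generalization are illustrative assumptions.

```python
# Sketch of attribute-sequence sorting and cluster-based generalization.
# Not the thesis's UACT implementation; column names and data are assumed.
from typing import Any, Dict, List

def attribute_sequence(critical: str, quasi_identifiers: List[str]) -> List[str]:
    """Put the critical attribute first so records close on it share a cluster."""
    return [critical] + [q for q in quasi_identifiers if q != critical]

def generalize_cluster(cluster: List[Dict[str, Any]], attrs: List[str]) -> List[Dict[str, Any]]:
    """Replace each attribute by its min-max interval (numeric) or '*' (categorical)."""
    generalized = []
    for record in cluster:
        new_record = dict(record)
        for attr in attrs:
            values = [r[attr] for r in cluster]
            if all(isinstance(v, (int, float)) for v in values):
                new_record[attr] = (min(values), max(values))   # numeric -> interval
            elif len(set(values)) > 1:
                new_record[attr] = "*"                           # categorical -> suppressed
        generalized.append(new_record)
    return generalized

def anonymize(records: List[Dict[str, Any]], critical: str,
              quasi_identifiers: List[str], k: int) -> List[Dict[str, Any]]:
    seq = attribute_sequence(critical, quasi_identifiers)
    # Assumes each attribute holds a single comparable type across all records.
    ordered = sorted(records, key=lambda r: tuple(r[a] for a in seq))
    result: List[Dict[str, Any]] = []
    i = 0
    while i < len(ordered):
        cluster = ordered[i:i + k]
        if len(cluster) < k and result:
            # Fold a short tail into the previous cluster so every cluster has >= k records.
            cluster = ordered[i - k:]
            result = result[:i - k]
        result.extend(generalize_cluster(cluster, seq))
        i += k
    return result

if __name__ == "__main__":
    data = [
        {"age": 23, "zip": "62102", "disease": "flu"},
        {"age": 27, "zip": "62145", "disease": "flu"},
        {"age": 31, "zip": "62102", "disease": "cold"},
        {"age": 35, "zip": "62188", "disease": "cold"},
        {"age": 36, "zip": "62101", "disease": "flu"},
    ]
    for row in anonymize(data, critical="age", quasi_identifiers=["age", "zip"], k=2):
        print(row)
```

Placing the critical attribute first in the sort key keeps records that are close on it in the same cluster, so its generalized interval stays narrow; this is the intuition behind preserving higher fidelity on the critical attribute while the remaining quasi-identifiers absorb more of the generalization.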

Keywords

usage-oriented, privacy preservation, k-anonymity, clustering, generalization

Abstract (English)


Digitizing data has emerged as a trend, underpinned by information and communication technology (ICT), to utilize data more effectively and to improve service efficiency and quality. However, these digital data contain highly confidential and private information, and even legitimate access can cause privacy infringements. With the rising awareness of personal privacy and legal regulations, data providers need to de-identify data to meet k-anonymity requirements before using or sharing it. Current k-anonymity mechanisms assume that all attributes are equally important and disregard practical data-usage needs, which decreases the utility of the anonymized data for a particular usage. In this study, we propose a utility-based k-anonymity mechanism, UACT, which sorts and generalizes data according to a user-predefined attribute priority in order to satisfy both the k-anonymity requirement and data-utility needs. The experiments show that UACT not only satisfies the k-anonymity requirement and data-utility needs but also maintains low information loss and high execution efficiency.
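To make the k-anonymity requirement referenced above concrete, here is a small self-contained check (not taken from the thesis) that verifies every quasi-identifier combination in a released table is shared by at least k records. The column names and the example rows are assumptions for illustration only.

```python
# Hypothetical k-anonymity check over a released (generalized) table.
from collections import Counter
from typing import Any, Iterable, Mapping, Sequence

def is_k_anonymous(records: Iterable[Mapping[str, Any]],
                   quasi_identifiers: Sequence[str], k: int) -> bool:
    """True if every quasi-identifier equivalence class contains at least k records."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return bool(classes) and all(count >= k for count in classes.values())

if __name__ == "__main__":
    released = [
        {"age": (23, 27), "zip": "621**", "disease": "flu"},
        {"age": (23, 27), "zip": "621**", "disease": "flu"},
        {"age": (31, 36), "zip": "621**", "disease": "cold"},
        {"age": (31, 36), "zip": "621**", "disease": "cold"},
        {"age": (31, 36), "zip": "621**", "disease": "flu"},
    ]
    print(is_k_anonymous(released, ["age", "zip"], k=2))  # True
```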

