
Privacy-Preserving Techniques for Statistical Disclosure Control (統計資料公開控制的隱私維護技術)

Advisor: 尹邦嚴

Abstract


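Advances in information technology have made collecting large volumes of data easy, but this capability has also created a dilemma between data utility and the protection of individual privacy. For public statistical databases, the private information in microdata and tabular data can be protected effectively not only through laws enacted by government but also by applying statistical disclosure control (SDC) techniques, which perturb the data directly. This study introduces the SDC methods for both types of data and examines in depth the microaggregation method for microdata protection and the controlled tabular adjustment (CTA) method for tabular data protection. It proposes two microaggregation methods adapted from classical clustering methods, and a new CTA mathematical model called the importance and security model (ISM). ISM follows the spirit of CTA but is designed for the microdata protection problem. Until now, microdata and tabular data, the two major areas of statistical data privacy protection, have each developed their own SDC methods and related research; because the problems differ in nature, the methods of one do not carry over to the other. This study is the first to link the two areas: it reconsiders the privacy protection scenario and applies CTA, originally used for tabular data, to microdata. The resulting optimization problem is solved with a tabu search algorithm augmented with advanced search strategies, with the aim of achieving the best microdata protection under a maximum tolerable information loss.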

Parallel Abstract


The rapid development of information technology has made large-scale data collection easier. However, it also creates a dilemma between data utility and the protection of individual privacy. To protect the privacy of microdata and tabular data stored in public statistical databases, statistical disclosure control (SDC) techniques can be applied to perturb the data, in addition to legal protections enacted by government. This paper introduces the SDC techniques for both protection problems and focuses on microaggregation and controlled tabular adjustment (CTA). First, we develop two microaggregation methods based on classical clustering methods. Second, we propose a new CTA mathematical formulation called the importance and security model (ISM), which is designed for microdata protection. Traditionally, SDC methods and research for the two problems have developed separately; they apply to different scenarios and cannot be generalized from one to the other. This paper is the first attempt to connect the two scenarios, so that the new CTA formulation can be used to protect microdata. Finally, a tabu search algorithm with advanced search strategies is employed to solve the optimization problem, which seeks the best protection of microdata while satisfying a specified upper bound on information loss.
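To make the microaggregation idea concrete, here is a minimal sketch of the generic fixed-size, single-attribute variant: records are sorted, split into groups of at least k, and each value is replaced by its group mean. This is not one of the two methods proposed in the thesis, which the abstract does not specify. The function names, the sample values, and the SSE/SST information-loss measure are illustrative assumptions.

```python
from statistics import mean


def microaggregate(values, k):
    """Replace each value with the mean of its group of at least k sorted neighbours.

    Hypothetical helper: generic fixed-size univariate microaggregation,
    not the clustering-based methods developed in the thesis.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")
    # Indices of the records in ascending order of value
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs any remainder, so every group has >= k records
        end = len(values) if g == n_groups - 1 else start + k
        idx = order[start:end]
        centroid = mean(values[i] for i in idx)
        for i in idx:
            masked[i] = centroid
    return masked


def information_loss(original, masked):
    """Within-group sum of squares over total sum of squares (assumed measure)."""
    grand_mean = mean(original)
    sse = sum((o - m) ** 2 for o, m in zip(original, masked))
    sst = sum((o - grand_mean) ** 2 for o in original)
    return sse / sst if sst else 0.0


# Illustrative data only
incomes = [21, 58, 23, 60, 25, 62, 90]
masked = microaggregate(incomes, k=3)
loss = information_loss(incomes, masked)
print(masked, round(loss, 3))
```

Each published value is shared by at least k records, which is what provides the protection. A larger k strengthens protection but raises the information loss, which is the trade-off the abstract describes as a bound on tolerable information loss.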
