
擾動模型在類別資料的隱私分析

Privacy Analysis of Perturbation Models in Categorical Data

Advisor: 吳建華

Abstract


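The National Health Insurance Research Database (NHIRD) contains the medical records of people in Taiwan, and researchers in Taiwan may use it to study medical data. Before publishing, researchers must apply de-identification and encrypted identifiers to protect patient privacy. These two measures alone are not sufficient, however. For a well-known celebrity, for example, an attacker can still easily infer the person's medical records from a few conspicuous personal characteristics. The aim of this thesis is therefore to limit disclosure risk by perturbing and reconstructing the model while preserving as much of the content and structure of the data as possible, so that the protected data can still be analyzed statistically. In this thesis, perturbation is carried out by adding noise, and reconstruction by transforming groups and categories. Comparing the data before and after the transformation shows that the category swap ratio, the group swap ratio, and the amount of added noise are important factors affecting statistical power: the more of the original data is retained, the higher the power.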

Parallel Abstract


The National Health Insurance Research Database (NHIRD) contains the medical records of people in Taiwan, and researchers can use it to study medical data. Before publishing an article, researchers must apply anonymization and pseudonymization to protect patient privacy. However, anonymization and pseudonymization alone are not enough. For a well-known celebrity, for example, hackers can still easily infer the person's medical records from a few obvious personal characteristics. The purpose of this thesis is therefore to limit disclosure risk by perturbing and rebuilding the model, while retaining as much of the content and structure of the data as possible so that the data can be analyzed statistically while protected. In this thesis, the perturbation method is adding noise, and the rebuilding method is exchanging groups and categories. A comparison of the data before and after conversion shows that the exchange ratios of categories and groups and the amount of added noise are important factors affecting statistical power.
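The abstract does not specify the exact exchange mechanism, so the following is only an illustrative sketch of one simple way a category exchange ratio could be applied to a categorical column. The function names `perturb_categories` and `changed_fraction`, the uniform choice among the other categories, and the toy data are all assumptions made for illustration, not the thesis's method.

```python
import random


def perturb_categories(values, categories, swap_ratio, seed=0):
    """Replace each value, with probability swap_ratio, by a different category.

    Hypothetical helper: the replacement is drawn uniformly from the
    remaining categories, which is an assumption of this sketch.
    """
    if not 0.0 <= swap_ratio <= 1.0:
        raise ValueError("swap_ratio must be in [0, 1]")
    if len(categories) < 2:
        raise ValueError("at least two categories are needed to swap")
    rng = random.Random(seed)
    perturbed = []
    for value in values:
        if rng.random() < swap_ratio:
            # Swap: pick any category other than the current one.
            alternatives = [c for c in categories if c != value]
            perturbed.append(rng.choice(alternatives))
        else:
            # Keep the original value unchanged.
            perturbed.append(value)
    return perturbed


def changed_fraction(original, perturbed):
    """Fraction of records whose category differs after perturbation."""
    return sum(o != p for o, p in zip(original, perturbed)) / len(original)


if __name__ == "__main__":
    categories = ["A", "B", "C"]
    original = ["A"] * 50 + ["B"] * 30 + ["C"] * 20  # toy data, not from the thesis
    for ratio in (0.0, 0.1, 0.5):
        released = perturb_categories(original, categories, ratio, seed=42)
        print(ratio, changed_fraction(original, released))
```

A larger `swap_ratio` changes more records, which lowers disclosure risk but leaves less of the original data for analysis. That is the trade-off the abstract describes in terms of statistical power.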

