Data Protection for Frequent Pattern Mining in a Multi-Cloud Environment

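Privacy protection is increasingly important when data mining is outsourced to third-party resources. When data is transmitted over the network, attackers may try to obtain sensitive information for profit. In this thesis, we focus on privacy protection for frequent pattern mining. We adopt a technique that combines k-support anonymity with a taxonomy tree and convert it into a distributed algorithm for a multi-cloud environment. We split the database into several parts, each containing items with complete support and items with partial support. Each cloud can compute the frequent patterns of its items, and all outgoing data must satisfy k-support anonymity. We build a k-support anonymity taxonomy tree from the items with complete support, and place the items with partial support into the tree after adding noise of similar support. Because, by the definition of frequent pattern mining, a frequent pattern must satisfy the minimum support, we exclude the items that do not meet it and add to the tree only the items that satisfy the minimum support of each cloud. Items with similar support across clouds can cover for one another, so no additional noise needs to be generated. Using a multi-cloud environment has several advantages. The data we send out is partial and incomplete, so an attacker who obtains the data of only one cloud can never reconstruct the complete original data. No cloud knows of the existence of the others, so we can act on each cloud independently without worrying that the action will affect the other clouds. Finally, if the database is very large, our algorithm can split the data and reduce memory usage.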

Keywords

cloud; information security

Parallel Abstract

The privacy of outsourced data mining is becoming more and more important. When data is transmitted over the Internet, attackers may try to obtain sensitive information for profit. In this paper, we focus on the privacy of frequent pattern mining. We take a technique called k-support anonymity with a taxonomy tree and extend it to a multi-cloud environment. We segment the database into several parts by sensitive item. Each part has some items with complete support and some items with partial support, and each part can compute the frequent patterns of its items with complete support. To satisfy k-support anonymity, we build a taxonomy tree from the items with complete support and add the items with partial support to it as noise. By the definition of frequent pattern mining, a frequent pattern must satisfy the minimum support, so we exclude the items that do not satisfy it. Excluding the items whose support is lower than the minimum support of each part reduces both the amount of noise and the computation required. The noise in each part is partial, so items with nearly equal support can cover one another. Using a multi-cloud environment has several advantages. The data we send out is partial, so an attacker who obtains the data of only one cloud can never reconstruct the original data. Each cloud does not know what the other clouds are doing, so we can take a distinct action on each cloud without worrying that it will affect the others. In our algorithm, if the database is very large, we can split the data and reduce the memory cost.
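The steps named in the abstract (splitting by sensitive item, dropping items below the minimum support, and checking k-support anonymity) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function names, the toy transactions, and the split rule (a transaction goes to the part of the first sensitive item it contains) are assumptions made for illustration, and the anonymity check uses the common reading of k-support anonymity, namely that every item shares its support value with at least k - 1 other items. The taxonomy tree and noise insertion are not modeled.

```python
from collections import Counter

def split_by_sensitive_item(transactions, sensitive_items):
    """Assumed split rule (hypothetical): a transaction goes to the part of
    the first sensitive item it contains; others go to the part keyed None."""
    parts = {}
    for t in transactions:
        key = next((s for s in sensitive_items if s in t), None)
        parts.setdefault(key, []).append(set(t))
    return parts

def item_supports(part):
    """Count, for each item, how many transactions of this part contain it."""
    counts = Counter()
    for t in part:
        counts.update(t)
    return counts

def drop_infrequent(supports, min_support):
    """Keep only the items whose support reaches the minimum support."""
    return {item: s for item, s in supports.items() if s >= min_support}

def is_k_support_anonymous(supports, k):
    """Assumed definition: every item shares its support value with at
    least k - 1 other items."""
    group_sizes = Counter(supports.values())
    return all(group_sizes[s] >= k for s in supports.values())

# Toy data: "a" and "b" play the role of sensitive items.
transactions = [
    {"a", "x", "y"},
    {"a", "x"},
    {"b", "x", "y"},
    {"b", "y"},
    {"b", "z"},
]
parts = split_by_sensitive_item(transactions, ["a", "b"])
frequent_a = drop_infrequent(item_supports(parts["a"]), min_support=2)
frequent_b = drop_infrequent(item_supports(parts["b"]), min_support=2)
ok_a = is_k_support_anonymous(frequent_a, k=2)
ok_b = is_k_support_anonymous(frequent_b, k=2)
```

In this toy split, the part for "a" passes the check with k = 2 because "a" and "x" share the same support, while the part for "b" fails because no other item matches the support of "b". A failing part is where the approach described in the abstract would add noise before the data leaves for the cloud.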

Parallel Keywords

cloud; privacy
