A Real-Time Anti-Data Mining Protection Technique for Dynamic Databases

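As information technology continues to advance, new and increasingly capable data mining techniques keep emerging, but they also raise data security concerns and expose enterprises to the risk of leaking latent knowledge and private information. Previously proposed protections against data mining mostly target existing static databases, using methods such as data distribution and data modification. Large dynamic databases, by contrast, are characterized by rapid data inflow, very large data volumes, frequent changes, and users who need real-time responses. If protection is applied only to the data that already exists before it is provided to a third party, and newly added data is not put through protection processing again, then data stolen by a malicious party may never have been protected. Private data or important enterprise knowledge could still be obtained, seriously threatening the enterprise's survival and development; this is a gap in database security. Yet little research has so far addressed real-time protection for dynamic databases. This study therefore proposes a real-time technique for protecting dynamic databases against data mining. The method uses a decision tree tool to identify the key attribute of a table as the basis for horizontally partitioning it, generates a partition procedure by setting multiple attribute-value interval conditions, and stores each record in a different partition table according to its attribute value at the time it is written. The study also proposes a layered data access concept: (1) data layer: multiple database servers are set up to store the data; (2) protection layer: holds the partition and restoration procedures; on write, data is stored in different storage areas according to the interval conditions of the partition procedure, and on read, the layer verifies whether the user is authorized before returning data content; (3) application layer: users access data through applications or web programs. Data access to the dynamic database goes through this architecture, which provides real-time protection and restoration.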
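The write and read paths of the protection layer described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the class name `ProtectionLayer`, the `age` attribute, the cut points, and the in-memory lists standing in for partition tables on separate database servers are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ProtectionLayer:
    """Hypothetical protection layer: routes writes to partitions, gates reads."""
    key_attribute: str        # attribute used for horizontal partitioning (assumed given)
    boundaries: list          # sorted cut points defining the value intervals
    authorized_users: set     # users allowed to read restored data
    partitions: list = field(init=False)

    def __post_init__(self):
        # One partition per interval: (-inf, b0), [b0, b1), ..., [bn, +inf).
        # Plain lists stand in for tables on separate database servers.
        self.partitions = [[] for _ in range(len(self.boundaries) + 1)]

    def partition_index(self, record):
        # bisect_right sends a value equal to a cut point to the upper interval.
        return bisect_right(self.boundaries, record[self.key_attribute])

    def write(self, record):
        # Partitioning happens at write time, so new records are never
        # stored in an unpartitioned form.
        idx = self.partition_index(record)
        self.partitions[idx].append(record)
        return idx

    def read_all(self, user):
        # Restoration: only authorized users receive the recombined table.
        if user not in self.authorized_users:
            raise PermissionError(f"user {user!r} is not authorized")
        return [rec for part in self.partitions for rec in part]


layer = ProtectionLayer("age", [30, 60], {"alice"})
for rec in ({"id": 1, "age": 25}, {"id": 2, "age": 30}, {"id": 3, "age": 71}):
    layer.write(rec)
restored = layer.read_all("alice")
```

Because each record is routed as it arrives, no separate batch protection pass is needed when the data grows, which is the gap in static approaches that the abstract points out.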

Keywords

Data mining; Anti-data mining; Database security; Privacy preservation; Knowledge hiding; Support vector machine

Parallel Abstract

With the vigorous development of information technology, novel and more capable data mining techniques continue to appear. These developments can lead to the disclosure of enterprises' knowledge and private data. Earlier protection techniques against data mining mostly targeted static databases, using methods such as data distribution and data modification. A large dynamic database, in comparison, has fast data inflow, an extremely large volume of data, frequent changes, and users who expect immediate responses. If protection is applied only to the existing data before it is released to a third party, records added later remain unprotected unless the protection process is run again. Private data or enterprise knowledge could then be obtained by unauthorized users, threatening the enterprise's survival and future development. This is a gap in database security. Currently, there is little research on real-time protection for dynamic databases. Therefore, this thesis proposes a real-time anti-data mining technique to address the security problems of dynamic databases. The proposed method uses a decision tree to obtain the key attributes of a table. The key attributes serve as the basis for horizontal partitioning: several attribute-value interval conditions define the partition procedure, and each record is stored in a partition table according to its attribute value when it is written. The thesis also proposes a layered data access concept: (1) Data layer: multiple database servers are set up to store data. (2) Protection layer: holds the partition and recovery procedures; data is saved to different storage zones according to the interval conditions of the partition procedure, and the user's authorization is verified when data is read. (3) Application layer: users access data through applications or web programs.
Real-time protection and recovery are provided whenever data in the dynamic database is accessed through this architecture.
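The abstract does not say which decision tree algorithm is used to obtain the key attributes. As one possible reading, the sketch below ranks categorical attributes by ID3-style information gain and takes the attribute a decision tree would split on at its root. The function names, the `churn` target, and the sample rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def key_attribute(rows, attributes, target):
    """Attribute with the highest information gain (the ID3 root split)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical sample table: "tier" determines "churn", "region" does not.
sample_rows = [
    {"region": "north", "tier": "gold", "churn": "no"},
    {"region": "south", "tier": "gold", "churn": "no"},
    {"region": "north", "tier": "basic", "churn": "yes"},
    {"region": "south", "tier": "basic", "churn": "yes"},
]
chosen = key_attribute(sample_rows, ["region", "tier"], "churn")
```

The selected attribute would then serve as the partitioning key, with its value intervals defining the partition conditions.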

Parallel Keywords

Data mining; Anti-data mining; Database security; Privacy preservation; Knowledge hiding; Support vector machine

