
基於 PySyft 安全聚集之聯盟式學習應用的效能分析與優化

Performance Analysis and Optimization for Federated Learning Applications with PySyft-based Secure Aggregation

Advisor: Shih-Hao Hung (洪士灝)

Abstract


To address privacy concerns, federated learning (FL) has emerged as a promising machine learning technique that enables multiple decentralized clients to collaboratively train a shared model while keeping their private training data local. Although FL reduces the risk of data leakage, it is still possible for attackers to reverse-engineer a trained model and recover information about the original training data provided by an FL client. To avoid such risks, secure aggregation (SA) can be used to privately combine the clients' trained models when updating the shared model. However, SA usually introduces performance overhead, as it requires additional computation for encryption operations and extra communication when secure multi-party computation (SMPC) is used. In this thesis, we analyze the performance of FL with SA using PySyft, an open-source framework that includes an FL implementation, and propose an asynchronous FL mechanism to improve overall performance. We find that overall performance depends on the computational capabilities of the clients and the characteristics of the communication network, so we propose a performance modeling method that helps system designers break down the execution time and make suitable trade-offs among privacy, efficiency, and accuracy for a balanced system.
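The secure-aggregation idea described above can be illustrated with a minimal sketch of additive secret sharing, a common SMPC building block. This is a self-contained toy example in plain Python, not PySyft's actual API: each client splits its (quantized) model update into random shares distributed among all parties, and each party publishes only the sum of the shares it holds, so individual updates stay hidden while the total is recovered exactly.

```python
import random

PRIME = 2**61 - 1  # arithmetic over a prime field so each share looks uniformly random


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_aggregate(client_updates):
    """Aggregate client updates without revealing any individual update.

    Each client secret-shares its update across all parties; every party
    then publishes only the sum of the shares it received.  Summing those
    partial sums reconstructs the total, while each individual update
    stays masked by uniformly random shares.
    """
    n = len(client_updates)
    shares_held = [[] for _ in range(n)]  # shares_held[j]: shares received by party j
    for update in client_updates:
        for j, s in enumerate(share(update, n)):
            shares_held[j].append(s)
    # Each party reveals only the sum of its received shares.
    partial_sums = [sum(held) % PRIME for held in shares_held]
    return sum(partial_sums) % PRIME


updates = [12, 7, 30]  # e.g. quantized gradient values from 3 clients
assert secure_aggregate(updates) == sum(updates)  # only the total, 49, is revealed
```

The sketch also makes the overhead analyzed in the thesis visible: every client must generate and transmit one share per peer, so the extra communication grows with the number of parties, which is exactly the cost that motivates the asynchronous mechanism and the performance model.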

