Vertical federated learning allows multiple institutions that hold the same samples but different feature sets to jointly learn a model without leaking private data. However, this collaboration presupposes that the institutions share a sufficient number of overlapping samples; otherwise, model performance may suffer. In practice, only a small fraction of an institution's data qualifies as usable (overlapping) samples, and most of the data cannot be used by vertical federated learning at all. We therefore propose a federated dimensionality reduction algorithm (FedDRA) that improves model performance in this situation. We design a secure computation protocol for the dimensionality reduction algorithm discriminant component analysis (DCA), so that it satisfies the privacy requirements of vertical federated learning. In addition, we design a dedicated optimization that exploits the non-overlapping data so that the model retains good accuracy. Our main contributions are threefold: first, we maintain strong performance even when the usable overlapping samples are limited; second, compared with other work in this area, our method has the lowest communication cost during the prediction phase, making it more efficient in practice; third, we are the first to provide a flexible vertical federated learning algorithm that can be combined with various downstream tasks or subsequent algorithms.
Vertical federated learning (VFL) enables parties that hold different features of the same samples to jointly learn a machine learning model without exposing their own data. However, all collaborating parties must share enough overlapping samples to ensure the performance of VFL models. In reality, the fraction of overlapping samples is usually insufficient, and most non-overlapping samples go unused. Therefore, we propose a dimensionality reduction algorithm for vertical federated learning, a supervised projection approach that improves the performance of VFL models when training samples are insufficient. We adapt discriminant component analysis (DCA) to the VFL setting and incorporate the information extracted from the non-overlapping data into FedDRA (Federated Dimensionality Reduction Algorithm). Our main contributions are threefold: (1) FedDRA performs well with a small number of overlapping samples; (2) it is an adaptable VFL framework that can be combined with any data analysis technique; (3) compared with other VFL works, it has the lowest communication cost during prediction.
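To make the supervised projection idea concrete, the following is a minimal centralized sketch of a DCA-style discriminant projection: it maximizes between-class scatter relative to a ridge-regularized within-class scatter and keeps the top-k components. This is only an illustration of the underlying dimensionality reduction; it is not the paper's secure VFL protocol, and the function name, `rho` regularizer, and toy data are assumptions for the example.

```python
import numpy as np

def dca_like_projection(X, y, k, rho=1e-3):
    """Project X onto k discriminant components (centralized sketch).

    Maximizes between-class scatter Sb against a ridge-regularized
    within-class scatter Sw, in the spirit of discriminant component
    analysis. Not the secure federated protocol described in the paper.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))  # between-class scatter
    Sw = np.zeros((d, d))  # within-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
        Sw += (Xc - mc).T @ (Xc - mc)
    # ridge term rho*I keeps Sw invertible (DCA-style regularization)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + rho * np.eye(d), Sb))
    order = np.argsort(-evals.real)[:k]  # top-k discriminant directions
    W = evecs[:, order].real
    return X @ W  # reduced-dimension representation

# toy usage: two classes, five features, reduced to one component
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
Z = dca_like_projection(X, y, k=1)
```

In the VFL setting of the abstract, each party would hold only a vertical slice of `X`, so the scatter matrices above must instead be assembled through the secure computation protocol the paper designs for DCA.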