Federated learning offers a solution that enables multiple client platforms to share knowledge and learn high-performance models without disclosing their own data. Most current federated learning research relies on a single shared model to make predictions for all clients. When each client's data are non-IID, a single shared model is difficult to fit to every client's data. Moreover, most studies assume that all clients in the federated learning framework use the same model architecture, which diverges from real-world application scenarios. This thesis proposes a novel federated learning framework suited to non-IID client data; the framework also allows clients to use different model architectures. For the experiments, this thesis constructs both IID and non-IID datasets and explores the proposed framework under homogeneous and heterogeneous model architectures. Disease-classification experiments on the public MIMIC-III dataset show that the proposed federated learning framework is effective regardless of whether clients use homogeneous or heterogeneous model architectures, and whether the data are IID or non-IID.
Federated learning provides a solution that enables multiple client platforms to share knowledge and learn high-performance models without leaking their own data. Most current federated learning research relies on a single shared model to make predictions on all clients' data. When each client's data are non-IID, it is difficult for a single shared model to fit the data of all clients. In addition, most studies assume that all clients in the federated learning framework use the same model architecture, which differs from real-world scenarios. This thesis proposes a novel federated learning framework that adapts to non-IID client data and, at the same time, allows clients to use different model architectures. For the experiments, this thesis constructs both IID and non-IID datasets and explores the proposed framework with both homogeneous and heterogeneous model architectures. Disease-classification experiments are carried out on a public dataset, MIMIC-III. The results show that the proposed federated learning framework is effective regardless of whether the clients use homogeneous or heterogeneous model architectures, and whether the data are IID or non-IID.
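The abstract states that both IID and non-IID client datasets are constructed, without specifying how. As a hedged illustration only (not necessarily the partitioning method used in this thesis), one common way to simulate such splits is to deal samples out uniformly for the IID case and to skew each client's label distribution with a Dirichlet prior for the non-IID case; the function name and `alpha` parameter below are illustrative assumptions:

```python
import numpy as np

def partition_labels(labels, num_clients, alpha=None, seed=0):
    """Split sample indices across clients.

    alpha=None gives an IID split (shuffle and deal out evenly);
    a small Dirichlet alpha (e.g. 0.1) skews each client's label
    distribution, simulating non-IID data. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    if alpha is None:
        # IID: random permutation split into equal-sized shards
        idx = rng.permutation(len(labels))
        return [part.tolist() for part in np.array_split(idx, num_clients)]
    clients = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        # Dirichlet proportions decide each client's share of this class
        props = rng.dirichlet([alpha] * num_clients)
        cuts = (np.cumsum(props)[:-1] * len(cls_idx)).astype(int)
        for client, part in zip(clients, np.split(cls_idx, cuts)):
            client.extend(part.tolist())
    return clients
```

With a large `alpha` the non-IID split approaches the IID one; as `alpha` shrinks, each client ends up dominated by a few classes, which is the regime where a single shared model struggles.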