Design and Study of an MLOps Framework Based on Serverless FaaS

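In recent years, the rapid development of cloud computing has driven steadily growing demand for big data analytics and system services. Many enterprises and cloud service providers have begun to use artificial intelligence (AI) techniques, such as machine learning and deep learning models, to process the large volumes of data they collect and to create new application value. Using AI, however, is a time-consuming and complex computing task: it may require infrastructure equipped with high-performance GPUs, and the execution environment and configuration of each stage must be set up manually before a model can be trained. As a result, developers struggle to deliver quickly when developing and training models, which in turn limits how quickly enterprises can offer model services and realize greater business value. Several MLOps frameworks already exist to improve the development workflow of machine learning application services, but they all build models with microservices. A model service built with microservices must hold part of the computing resources to keep the service alive even when there are no requests, which wastes computing resources and lowers utilization. This study therefore designs an event-driven ML FaaS (Function as a Service) system. It uses Kubernetes to manage and schedule the execution of ML Functions, builds an end-to-end automated ML Pipeline covering the development, training, and deployment of models for machine learning application services, and uses serverless computing to solve the resource occupation problem. The experimental results show that the proposed MLOps framework improves the utilization of system computing resources and, when multiple ML Pipelines run concurrently, markedly reduces the total pipeline execution time.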

Keywords

Serverless Computing; Function as a Service; Containers; Kubernetes; Machine Learning; MLOps

Parallel Abstract

In recent years, the rapid development of cloud computing has led to continued growth in demand for big data analysis and system services. Many enterprises and cloud service providers have begun to use artificial intelligence (AI) technology, such as machine learning or deep learning models, to handle large amounts of collected data and create new application value. However, using AI technology is a very time-consuming and complex task: it may require an infrastructure system equipped with high-performance GPUs, and the execution environment and configuration for each stage must be set up manually before a model can be trained. This makes it difficult for developers to deliver rapidly when developing and training models, and it also hinders enterprises from quickly providing model services to obtain greater commercial value. In addition, a variety of MLOps frameworks currently exist to improve the development process of machine learning application services, but these frameworks all use microservices for model development. A model service developed with microservices must occupy some computing resources to maintain its state even when there is no task demand, which wastes computing resources and results in a low utilization rate. Therefore, this study designs an event-driven ML FaaS (Function as a Service) system. The proposed system uses Kubernetes to manage and orchestrate the execution of ML Functions, and it builds an end-to-end automated ML Pipeline process covering the development, training, and deployment stages of machine learning application services. The proposed system also uses serverless computing to solve the problem of computing resource occupation.
According to the experimental results of this study, the proposed MLOps framework can increase the utilization rate of system computing resources and can markedly reduce the total execution time when multiple ML Pipeline processes are executed at the same time.
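The event-driven idea summarized above can be illustrated with a small, purely in-process sketch. It is not the system built in this study: the `EventBus` class, the event names (`code.pushed`, `data.ready`, `model.trained`), and the toy stage functions are all hypothetical, and no Kubernetes or FaaS platform is involved. The sketch only shows the principle that each pipeline stage runs as a function triggered by an event and holds resources only while it executes, so nothing remains occupied once the pipeline is idle.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

# All names below are illustrative assumptions, not taken from the thesis.
Event = Tuple[str, dict]
Handler = Callable[[dict], Optional[Event]]


class EventBus:
    """Toy in-process stand-in for an event-driven FaaS platform."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}
        self.queue: Deque[Event] = deque()
        self.running = 0                  # functions currently holding resources
        self.invocations: List[str] = []  # events handled, in order

    def on(self, name: str, handler: Handler) -> None:
        """Subscribe one function to one event name."""
        self.handlers[name] = handler

    def emit(self, name: str, payload: dict) -> None:
        """Queue an event for later dispatch."""
        self.queue.append((name, payload))

    def run_until_idle(self) -> None:
        """Dispatch queued events; a function exists only while it runs."""
        while self.queue:
            name, payload = self.queue.popleft()
            handler = self.handlers.get(name)
            if handler is None:
                continue  # no function subscribed to this event
            self.running += 1  # function instance starts on demand
            try:
                follow_up = handler(payload)
            finally:
                self.running -= 1  # instance is released after the call
            self.invocations.append(name)
            if follow_up is not None:
                self.emit(*follow_up)  # trigger the next pipeline stage


def run_pipeline(raw: List[float]) -> Tuple[EventBus, List[dict]]:
    """Wire three toy stages (prepare, train, deploy) and run them once."""
    bus = EventBus()
    deployed: List[dict] = []

    def prepare(payload: dict) -> Optional[Event]:
        scaled = [x / 10 for x in payload["raw"]]
        return ("data.ready", {"data": scaled})

    def train(payload: dict) -> Optional[Event]:
        data = payload["data"]
        return ("model.trained", {"model": {"mean": sum(data) / len(data)}})

    def deploy(payload: dict) -> Optional[Event]:
        deployed.append(payload["model"])
        return None  # last stage, no follow-up event

    bus.on("code.pushed", prepare)
    bus.on("data.ready", train)
    bus.on("model.trained", deploy)

    bus.emit("code.pushed", {"raw": raw})
    bus.run_until_idle()
    return bus, deployed


if __name__ == "__main__":
    bus, deployed = run_pipeline([10, 20, 30])
    print(bus.invocations, bus.running, deployed)
```

In this sketch `bus.running` returns to zero after the last stage finishes, which mirrors the contrast drawn above with a microservice that keeps holding resources while it waits for requests.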

Parallel Keywords

Serverless Computing; Function as a Service; Containers; Kubernetes; Machine Learning; MLOps

