Integrated Information Service Platform: Applications of Information Visualization and Intelligent Analysis to Big Data

Advisor: 廖崇碩


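With the spread of the Internet, mobile devices, and social media, the volume of data circulating worldwide has grown explosively. International Data Corporation (IDC) estimates that the amount of data will increase tenfold from 2013 to 2020, from 4.4 trillion gigabytes to 44 trillion gigabytes. This surge has made the analysis and application of Big Data an extremely important topic. If these data can be used appropriately, they can bring new value and creativity to many areas and form a new trend in information and services.
This thesis combines the concept of an information service platform with visualization techniques and intelligent analysis functions to provide diverse information services. We build an integrated information service platform that provides analysis tools for Big Data and promotes the development of intelligent cloud service applications. For visualization, this thesis improves on an existing plotting tool to build iCircos, an automatic visualization tool: once data are uploaded, a structurally complete figure is drawn automatically, which lowers the barrier to use, reduces the time needed to learn the tool, and assists in interpreting and analyzing the data. For intelligent analysis, this thesis integrates intelligent analysis tools into the platform so that data analysis is not limited by the user's knowledge and background, which broadens the use of data analysis in information services. The platform provides three main services: data filtering, data visualization, and data analysis. Together they support the filtering, analysis, and visualization of Big Data, so that data can be presented faster and with greater value.
This thesis also uses disease data from the medical domain as a case study. The analysis results help users and experts obtain more valuable information from large volumes of experimental data. In particular, the visualization techniques help to quickly identify imported cases and to find the related geographic associations, while the intelligent analysis module can uncover genes related to gene regulation mechanisms and support the study of how gene mutations cause drug resistance and of the corresponding treatment strategies. The platform can serve as a prototype of a service platform for Big Data. In collaboration with the Chang Gung Molecular Medicine Research Center, it successfully supported the development of their Targeted Sequencing medical system, which demonstrates the effectiveness of the information service platform built in this thesis. Finally, we expect that this platform can provide processing, analysis, and visualization services for Big Data in other research fields, as a reference for in-depth research by domain experts.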


With the rapid development of the Internet, mobile devices, and social media, the amount of data available around the world has grown explosively. International Data Corporation (IDC) estimates that the amount of data will increase tenfold from 2013 to 2020; specifically, the total may grow from 4.4 trillion gigabytes to 44 trillion gigabytes. This explosion has made the analysis and application of Big Data an important research issue. If Big Data can be used appropriately, it will bring new value and innovation to daily life and set a new trend in information systems and services. This thesis incorporates information visualization and intelligent analysis of Big Data into the concept of a service platform. We build an integrated information service platform that provides diverse services for users and promotes the development of cloud service applications. The platform provides three main services: data filtering, data visualization, and intelligent analysis, which help users present data faster and more effectively. In addition, we demonstrate the usefulness of the platform with case studies on medical disease data. The results show that the platform can effectively help users retrieve valuable information from vast amounts of experimental data. More specifically, the visualization tool helps identify imported disease cases and their related geographic areas. Moreover, the intelligent analysis identifies new test reagents and antibiotic resistance genes for drug development. The prototype can thus offer data analysis services on Big Data through a user-friendly interface. Finally, we collaborated with the Chang Gung Molecular Medicine Research Center by applying our function modules, the automatic visualization and intelligent analysis tools, to their Targeted Sequencing System. This shows that our platform can successfully support the construction of other information service systems.
In the future, we expect that the platform can provide an integrated service of data processing, data analysis, and data visualization for Big Data in other research fields.


