A Data Mining Framework for a Wind Engineering Aerodynamic Database: Discussion and Empirical Study

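In recent years, widespread networking and growing computing power have matured data mining techniques, which are now applied across many fields. Data mining is a highly systematic analysis process: every stage, from problem definition and data processing through model construction to the interpretation and evaluation of results, is grounded in data. Through this process, useful information is uncovered to support decision making. The Wind Engineering Research Center at Tamkang University (TKU-WERC) has accumulated a large volume of wind tunnel test data over years of experiments, stored in its aerodynamic database. The database preserves existing knowledge and also holds knowledge still waiting to be discovered. This study therefore builds a data mining framework for the wind engineering aerodynamic database. The aim is to design a systematic data exploration process that supports data inspection and processing, analytical modeling, and visualization of results in charts. Three data sets from the TKU-WERC aerodynamic database are used: the bandwidth of the across-wind force spectrum, the optimum design fractile of extreme wind pressure on low-rise building roof surfaces, and the optimum design fractile of wind pressure on high-rise building surfaces. Each data set is partitioned into training, validation, and test subsets for model validation and evaluation. The analysis tool is SAS Enterprise Miner 14.1. The method combines decision tree and regression models, with Classification and Regression Trees (CART) as the decision tree algorithm. The analysis has two parts. The decision tree part yields variable importance weights and important rules. The regression part uses the rules obtained from the decision tree to divide the data into groups and trains a regression on each group, with the goal of obtaining regression formulas that are both accurate and simple. Accuracy is judged by the mean absolute percentage error (MAPE) and by actual-versus-predicted plots. The study starts with the across-wind force spectrum bandwidth to design and evaluate the framework, then validates the framework with the other two data sets. The results are usable models with MAPE below 10% and regression formulas of fewer than six terms, together with an interpretation of the phenomena behind the important rules found for each data set.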

Keywords

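Data mining; Decision tree; Regression analysis; SAS Enterprise Miner; Predictive model; Vortex shedding frequency; Extreme wind pressure; Optimum design fractile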

Parallel Abstract

With the rapid progress of the Internet and of computing power, data mining techniques have matured and are now widely used across fields. Data mining is a systematic analysis process in which every stage is supported by data, from problem definition and structuring through data preparation and model construction to the evaluation and interpretation of results. It helps extract valuable information and supports decision making. TKU-WERC has accumulated a large amount of experimental data in its aerodynamic database, which serves both data preservation and knowledge discovery. This study builds a data mining framework for the wind engineering aerodynamic database and validates it with three cases. The purpose is to design a systematic data discovery process with functions for data inspection and preparation, modeling, evaluation, and data visualization. The data used in the three cases are the across-wind spectrum bandwidth near the vortex shedding frequency, the optimum design fractile of extreme pressure on low-rise building roof surfaces, and the optimum design fractile of extreme pressure on high-rise building surfaces. The tool used is SAS Enterprise Miner 14.1. The methodology combines two algorithms, a decision tree (CART) and regression, and the process has two parts. First, the decision tree provides the variable importance and a rule set, and the rules are used to form data groups. Second, a regression is trained on each group. The goal is to obtain precise and simple prediction formulas. Accuracy is evaluated with the Mean Absolute Percentage Error (MAPE) and an actual-versus-predicted plot. The first case is used to design and evaluate the framework, and the other two cases are used to validate it. The resulting regression models have a MAPE below 10% and fewer than six terms in each formula, and the important rule sets are used to interpret the phenomena in the data.
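The two-part process described above (tree rules to form groups, then one regression per group, scored by MAPE) can be illustrated with a minimal sketch. It is not the SAS Enterprise Miner workflow used in the study: it assumes a single input variable, a single CART-style split chosen by squared error, a straight-line fit per group, and synthetic data. All function names (`mape`, `best_split`, `fit_line`, `fit_grouped`, `predict`) are hypothetical, and the training/validation/test partition is omitted for brevity.

```python
from statistics import mean

def mape(actual, predicted):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))

def best_split(x, y, min_leaf=2):
    """CART-style regression split on one variable: pick the threshold that
    minimizes the summed squared error of the two resulting groups."""
    def sse(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    candidates = sorted(set(x))
    best_t, best_cost = None, float("inf")
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2  # midpoint between neighbouring observed values
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue  # keep enough points in each group to fit a line
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (assumes x is not constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def fit_grouped(x, y):
    """Stage 1: one tree rule (x <= t). Stage 2: one regression per group."""
    t = best_split(x, y)
    left = [(xi, yi) for xi, yi in zip(x, y) if xi <= t]
    right = [(xi, yi) for xi, yi in zip(x, y) if xi > t]
    models = {
        "left": fit_line([p[0] for p in left], [p[1] for p in left]),
        "right": fit_line([p[0] for p in right], [p[1] for p in right]),
    }
    return t, models

def predict(t, models, x):
    """Route each point through the rule, then apply that group's formula."""
    out = []
    for xi in x:
        a, b = models["left" if xi <= t else "right"]
        out.append(a + b * xi)
    return out

# Synthetic, made-up data with two regimes (not from the TKU-WERC database).
xs = list(range(1, 11))
ys = [2 + v if v <= 5 else 20 + 3 * v for v in xs]

threshold, group_models = fit_grouped(xs, ys)
error = mape(ys, predict(threshold, group_models, xs))
print("split threshold:", threshold)
print("MAPE (%):", error)
```

A single regression over both regimes would need extra terms to follow the jump, whereas splitting first lets each group keep a short formula, which is the trade-off between accuracy and simplicity that the abstract aims for.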

Parallel Keywords

Data mining; Decision tree; Regression; SAS Enterprise Miner; Predictive model; Vortex shedding frequency; Optimum design fractile; Extreme wind pressure

