  • Degree thesis

Predicting Housing Prices using Clustering-based Stacked Generator: A Study on Taoyuan City Actual Price Registration Data

Original title: 應用實價登錄建立以聚類方法之堆疊泛化房價預測模型 -以桃園市區分建物房價資料為例

Advisors: 陳樹衡, 鄧筱蓉

Abstract

This research explores the applicability of combining clustering with stacked generalization for predicting housing prices in Taiwan. Using the most recently available Taoyuan City Actual Price Registration data, we first extend the clustering-based ensemble learning method of Trivedi et al. (2015) into a two-layer clustering-based stacked generalization model. In the first layer, the cluster models are built with Lasso, KNN, and Decision Tree. In the second layer, the meta-models are built with Linear Regression, Random Forest, and XGBoost. The resulting stacked generalizers are then used to predict housing prices in Taoyuan City, and their prediction accuracy is compared with that of other machine learning models, namely Linear Regression, Random Forest, and XGBoost.
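The two-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the data are synthetic, the class and function names (`ClusterwiseRegressor`, `fit_stack`, `predict_stack`) are hypothetical, the number of clusters and all hyperparameters are arbitrary, and a simple holdout split stands in for whatever scheme the thesis uses to train the meta-model. Only Linear Regression is shown as the meta-model; Random Forest or XGBoost could be swapped in at the same place.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor


class ClusterwiseRegressor:
    """Hypothetical helper: k-means, then one base regressor per cluster."""

    def __init__(self, make_model, n_clusters, seed=0):
        self.make_model = make_model  # zero-argument factory for a regressor
        self.n_clusters = n_clusters
        self.seed = seed

    def fit(self, X, y):
        self.kmeans_ = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.seed
        ).fit(X)
        labels = self.kmeans_.labels_
        # Train a separate model on the rows assigned to each cluster.
        self.models_ = {
            c: self.make_model().fit(X[labels == c], y[labels == c])
            for c in range(self.n_clusters)
        }
        return self

    def predict(self, X):
        # Route each row to the model of its nearest cluster centre.
        labels = self.kmeans_.predict(X)
        out = np.empty(len(X))
        for c, model in self.models_.items():
            mask = labels == c
            if mask.any():
                out[mask] = model.predict(X[mask])
        return out


def fit_stack(X_base, y_base, X_meta, y_meta, n_clusters=3):
    """Layer 1: cluster-wise Lasso, KNN, tree. Layer 2: linear meta-model."""
    factories = [
        lambda: Lasso(alpha=0.01),
        lambda: KNeighborsRegressor(n_neighbors=3),
        lambda: DecisionTreeRegressor(max_depth=4, random_state=0),
    ]
    layer1 = [
        ClusterwiseRegressor(f, n_clusters).fit(X_base, y_base) for f in factories
    ]
    # Layer-1 predictions on held-out rows become the meta-model's features.
    Z = np.column_stack([m.predict(X_meta) for m in layer1])
    meta = LinearRegression().fit(Z, y_meta)
    return layer1, meta


def predict_stack(layer1, meta, X):
    Z = np.column_stack([m.predict(X) for m in layer1])
    return meta.predict(Z)


# Synthetic stand-in for housing data (assumption: two numeric features).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(600, 2))
y = 3.0 * X[:, 0] + 5.0 * (X[:, 1] > 5.0) + rng.normal(0.0, 0.5, size=600)

X_base, y_base = X[:300], y[:300]        # trains the layer-1 cluster models
X_meta, y_meta = X[300:500], y[300:500]  # trains the layer-2 meta-model
X_test, y_test = X[500:], y[500:]        # held out for evaluation

layer1, meta = fit_stack(X_base, y_base, X_meta, y_meta)
preds = predict_stack(layer1, meta, X_test)
mse = float(np.mean((preds - y_test) ** 2))
print(f"test MSE: {mse:.3f}")
```

Training the meta-model on rows the first layer has not seen keeps it from simply rewarding whichever base model overfits most. With real data, out-of-fold predictions from cross-validation would be a common alternative to the fixed split used here.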

Keywords

none

References


[1] Altman, N. S. (1992). An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. The American Statistician, 46(3), 175–185.
[2] Breiman, L. (1996a). Bagging Predictors. Machine Learning, 24(2), 123–140.
[3] Breiman, L. (1996b). Stacked Regressions. Machine Learning, 24(1), 49–64.
[4] Breiman, L. (2001). Random Forests. Machine Learning, 45, 5–32.
[5] Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification And Regression Trees. Chapman & Hall/CRC, 368.
