
Applying Machine Learning and Sentiment Analysis to Predict Airbnb Listing Prices in Taipei City (應用機器學習和情感分析預測臺北市Airbnb房源價格)

Advisor: 羅竹平

Abstract


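Advances in Internet technology and the spread of mobile devices have sharply reduced the cost of transmitting information. Peer-to-peer exchange has become increasingly common, which has fueled the rapid growth of the sharing economy, and Airbnb emerged on this wave. Unlike traditional hotel services, Airbnb offers more choice in price and listing type, giving guests a different accommodation experience.

For the accommodation industry, listing prices directly determine revenue, and prices that are too high or too low can harm the interests of both hosts and guests. Because most sharing-economy listings come from idle resources, however, the pricing parameters traditionally used by the hotel industry do not apply. This study therefore aims to develop a reliable price prediction model that lets both hosts and guests predict listing prices and so protect their own and each other's interests.

This study uses sentiment analysis to add past guests' reviews to the explanatory variables. It then applies p-value variable selection and Lasso variable selection to turn the data into five datasets with different sets of explanatory variables. Finally, six machine learning models, Multiple Linear Regression, Ridge Regression, Support Vector Regression, Random Forest Regression, Extreme Gradient Boost (XGBoost), and a Neural Network, are used to model and predict the prices of Airbnb listings in Taipei City.

The models are compared on explanatory power (R^2) and root mean square error (RMSE). XGBoost under Lasso variable selection performs best, with an RMSE of 0.4764 and an R^2 of 0.6434, and is therefore chosen as the price prediction model for Airbnb listings in Taipei City. The results indicate that listing prices are relatively unaffected by Taipei City's administrative district boundaries and are also unaffected by the distance between a listing and the nearest MRT station; the price differences observed across districts may be caused by other explanatory variables.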
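The two criteria used to rank the models, RMSE and R^2, can be computed as in the minimal sketch below. The function names and the sample values are illustrative assumptions rather than anything taken from the thesis, and the abstract does not state the scale on which the price target was measured.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y - y_hat)^2))."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out targets and predictions (not data from the thesis).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print("RMSE:", rmse(y_true, y_pred))
print("R^2: ", r_squared(y_true, y_pred))
```

A lower RMSE and a higher R^2 both indicate a better fit, which is why the abstract reports the two together when naming the best model.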

Parallel Abstract


With advances in Internet technology and the spread of mobile devices, the cost of transmitting information has fallen sharply. Peer-to-peer communication has become more common, and the sharing economy has grown rapidly; Airbnb emerged in this setting. Unlike traditional hotel services, Airbnb offers more choice in price and listing type, providing guests with a different accommodation experience.

For the accommodation industry, listing prices directly determine revenue. Prices that are too high or too low may harm the interests of both hosts and guests. However, since most listings in the sharing economy come from idle resources, the pricing parameters used by the hotel industry cannot be applied to Airbnb listing prices. This study therefore sets out to develop a reliable price prediction model that helps both hosts and guests predict listing prices and so protect the interests of both sides.

This study uses sentiment analysis to quantify past guests' comments as explanatory variables. Using p-value feature selection and Lasso feature selection, the data are converted into five datasets with different sets of explanatory variables. Finally, price prediction models for Taipei City Airbnb listings are built with Multiple Linear Regression, Ridge Regression, Support Vector Regression, Random Forest Regression, Extreme Gradient Boost (XGBoost), and a Neural Network.

Comparing the models on R^2 and RMSE shows that XGBoost under Lasso feature selection performs best, with an RMSE of 0.4764 and an R^2 of 0.6434. It was therefore chosen as the price prediction model for Airbnb listings in Taipei City.

Finally, the results indicate that the price of Airbnb listings in Taipei City is largely unaffected by the city's administrative district boundaries and is also unaffected by the distance between a listing and the nearest MRT station. Price differences across districts may instead be driven by other explanatory variables.
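As a rough illustration of Lasso-based feature selection, the sketch below fits a Lasso by coordinate descent and keeps only the features whose coefficients are not shrunk to zero. It is a from-scratch stand-in, not the thesis's implementation: the abstract does not say which library or penalty strength was used, so `alpha=0.1`, the function names, the feature names, and the toy data are all hypothetical. The sketch also assumes standardized features and a centered target, so no intercept is fitted.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 over w."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # mean correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * weights[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            weights[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return weights


def select_features(names, weights, tol=1e-12):
    """Keep the names of features with a non-zero Lasso coefficient."""
    return [name for name, w in zip(names, weights) if abs(w) > tol]


# Hypothetical toy data: three standardized, mutually orthogonal features.
# The target depends strongly on feature_a, weakly on feature_b,
# and not at all on feature_c.
names = ["feature_a", "feature_b", "feature_c"]
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.05, 1.95, -1.95, -2.05]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
print(select_features(names, weights))
```

Only features with non-zero coefficients would then be passed to the downstream regressors. Raising `alpha` drops more features, while lowering it keeps weaker ones.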

