  • Thesis

考量旅館特徵建構旅館線上評論摘要

Text Summarization for Online Hotel Reviews Based on Hotel Features

Advisor: 胡雅涵

Abstract


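With the rise of travel community websites such as TripAdvisor.com, travelers can browse reviews of hotels around the world. On these platforms a single hotel can have thousands of reviews, which leads to information overload. Although the website can sort reviews by date, travelers still cannot efficiently identify the reviews that match their own needs. A feature that condenses the many reviews into the key information and produces a summary for each hotel would therefore be very convenient for travelers.

This study proposes a method for automatically generating review summaries. It has four stages: (1) collecting reviews and preprocessing the data; (2) building a review helpfulness prediction model that considers factors such as review quality, the reviewer, and review sentiment, and using it to select helpful reviews; (3) classifying the sentences of the reviews by hotel feature into six sentence sets; and (4) computing an importance score for the sentences in each set and selecting the higher-scoring sentences to produce the hotel's review summary. The evaluation compares summaries produced by four approaches: (A) no helpfulness filtering and no classification by hotel feature; (B) no helpfulness filtering, with classification by hotel feature only; (C) helpfulness analysis only, without classification by hotel feature; and (D) the proposed method. Thirty subjects judged which result was best.

The experimental results show that random forest performed best for the helpfulness prediction model, and that most of the important variables were reviewer variables. In the summary evaluation, the summaries produced by the proposed method were preferred, and an ANOVA showed a significant difference between the proposed method and the other approaches. Previous text summarization studies have rarely considered both the quality of the data source and the hotel features mentioned in review sentences. We hope that this approach to generating summaries can serve as a reference for future related research.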

Parallel Abstract


Early work on text summarization focused on text processing techniques, such as TF-IDF and semantic approaches, and disregarded other useful information that could be extracted from online social media. With the rapid growth of social media, tourists describe their experiences using many different and complex aspects to refer to the features of a hotel. Although the number of reviews keeps increasing, their quality is not always good. To improve on previous research, this study proposes a text summarization method that selects representative reviews and then classifies review sentences by hotel feature. The research process is divided into four main steps: data preprocessing, review helpfulness analysis, review sentence classification, and text summarization. Four approaches are compared: (A) performing only the text summarization phase; (B) skipping the helpfulness analysis phase and performing the others; (C) skipping the sentence classification phase and performing the others; and (D) the proposed method. Thirty subjects were invited to rank the four summaries generated by these approaches. The results suggest that reviewer features are the most influential variables in the review helpfulness analysis, and an ANOVA shows that the proposed method differs significantly from the other approaches.
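The four-step process described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, and several parts are assumptions made for illustration: the feature names and keyword lists in `FEATURE_KEYWORDS` are hypothetical (the abstract does not name the hotel features), the helpfulness scores are passed in as given rather than predicted by a trained model, and the sentence importance score is a simple word-frequency stand-in for whatever scoring the study actually used.

```python
import re
from collections import Counter, defaultdict

# Hypothetical feature keywords: the abstract does not name the hotel
# features, so these categories and words are assumptions.
FEATURE_KEYWORDS = {
    "room": {"room", "bed", "bathroom"},
    "service": {"staff", "service", "reception"},
    "location": {"location", "station", "walk"},
    "food": {"breakfast", "restaurant", "food"},
    "cleanliness": {"clean", "dirty", "spotless"},
    "value": {"price", "value", "expensive"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(review):
    """Split a review into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def summarize(reviews, min_helpfulness=0.5, top_k=1):
    """reviews: list of (text, helpfulness_score) pairs.

    The helpfulness score is assumed to come from a separate prediction
    model; here it is simply supplied by the caller.
    """
    # Helpfulness analysis: keep only reviews scored as helpful.
    helpful = [text for text, score in reviews if score >= min_helpfulness]

    # Sentence classification: assign each sentence to every feature set
    # whose keywords it mentions.
    groups = defaultdict(list)
    for text in helpful:
        for sentence in split_sentences(text):
            tokens = set(tokenize(sentence))
            for feature, keywords in FEATURE_KEYWORDS.items():
                if tokens & keywords:
                    groups[feature].append(sentence)

    # Text summarization: score each sentence by the average frequency of
    # its words within its feature set, then keep the top_k sentences.
    summary = {}
    for feature, sentences in groups.items():
        docs = {s: tokenize(s) for s in dict.fromkeys(sentences)}
        freq = Counter(tok for doc in docs.values() for tok in doc)
        ranked = sorted(
            docs,
            key=lambda s: sum(freq[t] for t in docs[s]) / len(docs[s]),
            reverse=True,
        )
        summary[feature] = ranked[:top_k]
    return summary


reviews = [
    ("The staff were friendly. The room was clean and the bed was comfortable.", 0.9),
    ("Great location near the station. The staff were helpful and friendly.", 0.8),
    ("Bad breakfast.", 0.1),
]
summary = summarize(reviews)
for feature, sentences in summary.items():
    print(feature, sentences)
```

In this sketch, approach (A) from the comparison would correspond to skipping both the helpfulness filter and the feature grouping, while approach (D) runs all of the stages shown.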

