  • Thesis


A Comment Valuation Model Based on Comment Contents and Structure

Advisor: 侯建良


Readers who want a deeper understanding of a topic typically search for related articles through various online channels, read the articles one by one, and absorb the content they accept or find interesting. However, the articles collected this way are numerous and their content is complicated, so readers must spend considerable time reading each one before they can organize and absorb the relevant knowledge. In addition, readers still choose what to read mainly by their own subjective first impressions or by the subjective evaluations of other readers, so important information in the articles they skip is easily missed.

To address these problems, this study develops a comment valuation model based on comment content and structure. First, the methods for extracting the key attributes of a comment are clarified, and the key attributes are extracted accordingly. Next, to establish the relationship between the key attributes and the corresponding information value indexes, the Particle Swarm Optimization (PSO) algorithm is applied to solve an optimization model for the impact of the key attributes on information value, yielding an optimal combination of attribute weights. The derived weights are used to build the relationship model between key attributes and information value, together with a corresponding system that calculates and visualizes the value of a target comment to help readers choose which comments in a thread to read. Finally, discussion threads from the "Mobile01" internet forum and "Yahoo! Knowledge+" are used to evaluate the performance of the system. The results indicate that the system can effectively extract key attributes from comments and helps readers efficiently select the comments that meet their needs.
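
The weight-derivation step can be illustrated with a minimal sketch. The abstract does not state the form of the optimization model, so the code below assumes, purely for illustration, that a comment's value is a weighted sum of its key attribute scores and that PSO searches for the weights minimizing the mean squared error against reference values. The function name `pso_fit_weights`, its parameters, and the coefficient settings are hypothetical and are not taken from the thesis.

```python
import random


def pso_fit_weights(features, targets, n_particles=30, n_iters=200,
                    inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for attribute weights with a basic global-best PSO.

    Assumption (not from the thesis): the value of a comment is modeled as
    sum(w_i * x_i) and the objective is the mean squared error against
    reference values.
    """
    rng = random.Random(seed)
    dim = len(features[0])

    def loss(w):
        # Mean squared error between the weighted attribute sum and the target.
        total = 0.0
        for x, y in zip(features, targets):
            pred = sum(wi * xi for wi, xi in zip(w, x))
            total += (pred - y) ** 2
        return total / len(features)

    # Initialise particle positions (candidate weight vectors) and velocities.
    pos = [[rng.uniform(0.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the global best found so far.
    pbest = [p[:] for p in pos]
    pbest_loss = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cur = loss(pos[i])
            if cur < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], cur
                if cur < gbest_loss:
                    gbest, gbest_loss = pos[i][:], cur

    return gbest, gbest_loss
```

Here `features` would hold one row of attribute scores per comment and `targets` the reference value assigned to each comment; both are placeholders for whatever attributes and value indexes the actual model uses.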


