Aspect-Oriented Opinion Summarization Using Unsupervised Neural Networks

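The goal of review opinion summarization is to generate concise text that fuses the opinions expressed in multiple user reviews of a product. To avoid the cost of labeling training data for supervised learning, and to keep the summary focused on the product-aspect opinions expressed in the reviews, this thesis extends the Meansum model and proposes a two-stage summarization system based on unsupervised learning: aspect sentence extraction followed by opinion summary generation. In the first stage, the system applies non-aspect sentence filtering to remove noisy sentences from the reviews, learns from the remaining sentences a mapping to aspect feature vectors, and selects the sentences whose probability of expressing an aspect is sufficiently high to form a brief review text. In the second stage, Meansum is used as the summary generator, and both the training and test review data are first processed by the first stage. Experiments on Amazon review datasets covering different product categories show that an aspect sentence encoder trained on sentences selected by the proposed non-aspect sentence filtering method improves extraction Precision by 16% on average, compared with an aspect sentence encoder trained directly on the unfiltered review sentences. For multi-review summarization, the two-stage approach improves the score that measures how well the generated summary covers the important aspects in the reviews (Aspect_weight) by at least 14% over Meansum.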

Keywords

Aspect extraction; Summary generation; Opinion summarization

Parallel Abstract

The purpose of opinion summarization is to generate concise text from multiple user reviews of a product. The challenges of this task include avoiding the cost of labeling training data for supervised learning and keeping the summary focused on the product aspects expressed in the reviews. This thesis extends the Meansum model and proposes a two-stage method for summarizing multiple reviews, consisting of sentence extraction from reviews and summary generation from multiple documents, both by unsupervised learning. In the first stage, a non-aspect sentence filtering process removes the noisy sentences in the reviews. The remaining sentences are then used to train an aspect encoder, which encodes a sentence into a feature vector of aspects. After that, the sentences in a review with a sufficiently high probability of expressing a certain aspect are selected to form a brief review. In the second stage, Meansum serves as the summary generator, and both the training and test review data are first processed by the first stage. Performance was evaluated on Amazon review datasets covering various product categories. The results show that an aspect encoder trained on the sentences retained by the proposed non-aspect sentence filtering method improves the Precision of identifying aspect sentences by 16% on average across categories, compared with an aspect encoder trained without filtering. For opinion summarization of multiple reviews, the proposed two-stage approach achieves a better Aspect_weight score, which measures how well the generated summary covers the important aspects in the reviews, improving on Meansum by at least 14%.
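The first-stage selection step described above can be illustrated with a minimal sketch. It assumes that an aspect encoder has already produced, for each sentence, a probability distribution over aspects; the function name `select_aspect_sentences`, the example aspect set (battery, screen, price), the probability values, and the threshold of 0.6 are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def select_aspect_sentences(
    sentences: Sequence[str],
    aspect_probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cutoff, not a value from the thesis
) -> List[str]:
    """Keep sentences whose most likely aspect has probability >= threshold."""
    if len(sentences) != len(aspect_probs):
        raise ValueError("sentences and aspect_probs must have the same length")
    return [
        sentence
        for sentence, probs in zip(sentences, aspect_probs)
        if probs and max(probs) >= threshold
    ]


# Toy review; the second sentence expresses no clear product aspect.
review = [
    "The battery lasts two full days.",
    "I bought this for my brother.",
    "The screen is sharp and bright.",
]

# Hypothetical aspect distributions over (battery, screen, price),
# standing in for the output of a trained aspect encoder.
probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.30, 0.30],
    [0.10, 0.85, 0.05],
]

brief_review = select_aspect_sentences(review, probs, threshold=0.6)
print(brief_review)
```

In this sketch the brief review keeps only the sentences with a confident aspect assignment; the resulting shortened reviews would then be passed to the second-stage summary generator.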

Parallel Keywords

Aspect extraction; Summary generation; Opinion summarization
