  • Journal
  • OpenAccess

Question Retrieval with Distributed Representations and Participant Reputation in Community Question Answering


In recent years, community-based question answering (CQA) sites have grown rapidly in number and size. These sites are a valuable source of online knowledge; however, they often suffer from duplicate questions. The task of question retrieval (QR) is to find previously answered, semantically similar questions in CQA archives. Synonymous lexical variation, where the same meaning is expressed with different words, poses a major challenge for question retrieval. Some QR approaches address this issue by calculating the probability of correlation between new questions and archived questions. Much recent research has also focused on surface string similarity among questions. In this paper, we propose a method that first builds a continuous bag-of-words (CBoW) model with data from Asus's Republic of Gamers (ROG) forum and then determines the similarity between a given new question and the Q&As in our database. Unlike most other methods, we calculate the similarity between the given question and the archived questions and their descriptions separately, as two different features. In addition, we factor user reputation into our ranking model. Our experimental results on the ROG forum dataset show that our CBoW model with reputation features outperforms other top methods.
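The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy word vectors stand in for embeddings that a trained CBoW model would supply, and the averaged-vector sentence representation, the linear combination, the weights `w_title`, `w_desc`, and `w_rep`, and the reputation value normalized to [0, 1] are all assumptions made for illustration.

```python
import math

# Toy word vectors standing in for embeddings from a trained CBoW model.
# The values are invented for illustration only.
TOY_VECTORS = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.85, 0.15, 0.05],
    "overheats": [0.1, 0.9, 0.1],
    "hot": [0.15, 0.85, 0.05],
    "fan": [0.2, 0.7, 0.3],
    "keyboard": [0.1, 0.1, 0.9],
    "backlight": [0.05, 0.2, 0.85],
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector is missing or zero."""
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(query, item, w_title=0.5, w_desc=0.3, w_rep=0.2):
    """Combine two separate similarity features with a reputation feature.

    The linear form and the weights are hypothetical, not from the paper.
    """
    q = embed(query)
    title_sim = cosine(q, embed(item["title"]))        # feature 1: question
    desc_sim = cosine(q, embed(item["description"]))   # feature 2: description
    return w_title * title_sim + w_desc * desc_sim + w_rep * item["reputation"]


def rank(query, archive):
    """Return archived Q&As sorted from most to least relevant."""
    return sorted(archive, key=lambda item: score(query, item), reverse=True)


# Hypothetical archive; "reputation" is assumed to be normalized to [0, 1].
ARCHIVE = [
    {"id": "q1", "title": "notebook hot", "description": "fan hot",
     "reputation": 0.8},
    {"id": "q2", "title": "keyboard backlight",
     "description": "keyboard backlight", "reputation": 0.9},
]

if __name__ == "__main__":
    for item in rank("laptop overheats", ARCHIVE):
        print(item["id"], round(score("laptop overheats", item), 3))
```

In this toy setup the query "laptop overheats" shares no surface words with the archived title "notebook hot", yet the pair is ranked first because the averaged vectors are close. This is the kind of synonymous lexical variation that surface string similarity would miss.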


