Bayesian Bridging Topic Models for Classification

We study the problem of constructing a topic-based model across different domains for text classification. In real-world applications, unlabeled documents are abundant but labeled documents are sparse, which makes it challenging to build a reliable and adaptive model for classifying large document collections that span different domains. A classifier trained on a source domain tends to perform poorly on test data from a target domain, and the trained model is also prone to errors among ambiguous classes. In this study, we tackle the issues of domain mismatch and confusable classes and perform discriminative transfer learning for text classification. We propose a Bayesian bridging topic model (BTM) learned from a variety of labeled and unlabeled documents and use it for transfer learning in cross-domain text classification. A structural model is built, and its parameters are estimated by maximizing the joint marginal likelihood of labeled and unlabeled data via a variational inference procedure. We further apply discriminative learning to the proposed model, adjusting its parameters under the minimum classification error criterion. We show that the proposed model achieves better cross-domain text classification performance than other models.
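
As a rough illustration of the minimum classification error criterion mentioned above, the sketch below computes a smoothed per-document loss from class scores. It follows the commonly used textbook form (a soft maximum over competing classes passed through a sigmoid), which may differ from the exact formulation in the paper. The function name `mce_loss` and the constants `eta` and `gamma` are assumptions made for illustration, and the scores stand in for whatever class discriminants the topic model produces.

```python
import math

def mce_loss(scores, label, eta=1.0, gamma=1.0):
    """Smoothed minimum classification error loss for one document.

    scores: class discriminant values g_j(x), e.g. log-posteriors
            (hypothetical inputs, not taken from the paper).
    label:  index of the correct class.
    eta, gamma: illustrative smoothing constants (assumed values).
    """
    competitors = [g for j, g in enumerate(scores) if j != label]
    # Soft maximum over competing classes, shifted by the largest score
    # for numerical stability; it approaches max() as eta grows.
    m = max(competitors)
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitors) / len(competitors)
    soft_max = m + math.log(mean_exp) / eta
    # Misclassification measure: positive when competitors outscore the true class.
    d = soft_max - scores[label]
    # Sigmoid maps the measure to a differentiable surrogate of the 0-1 error.
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

Because the loss is differentiable in the scores, model parameters could in principle be adjusted by gradient descent on its average over labeled documents; a correctly classified document with a wide margin gives a loss near 0, and a confidently misclassified one gives a loss near 1.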

Keywords

transfer learning; topic model; cross-domain classification; latent Dirichlet allocation; Bayesian

