
A Study of the Subject Categorization of the MIS-related Journals in the ISI Databases Using Topical Features in the Text Content and Machine Learning Methods

Abstract

This study uses topic modeling, journal clustering, and subject category prediction to examine how MIS-related journals in the ISI subject category Information Science & Library Science (IS&LS) are also assigned the subject category Management. In the journal clustering experiment, all journals assigned to Management, together with other journals that share similar topical features, were gathered into a single cluster whose common and most prominent topic is "management". Because the journals in this cluster are largely the same as those in the MIS clusters reported in previous studies, we treat it as the MIS cluster of this study. In the category prediction experiment, we used the classification and regression tree (CART) technique to predict journal category assignment in two tests: one taking the journals in the original ISI Management category as positive examples, the other taking the journals in our MIS cluster as positive examples. The trees generated by both tests use the occurrence probability of the "management" topic as the main classification rule. The second test, however, produced both a simpler classification tree and fewer prediction errors. This suggests that assigning all journals in the MIS cluster to the Management category would make retrieval results more complete and effective.

