The first step in text analysis is to parse the text so that meaningless stop words are filtered out and the keywords that carry the main points are retained. This is especially important for academic databases. The sample of this study was retrieved from the Thomson Reuters Web of Science (WOS) database and consists of 256 academic papers published by Taiwanese scholars in the journal Scientometrics from 1995 to 2024. The sample was then searched with the topic words "accounting" and "business" to examine whether WOS misclassifies papers. Two kinds of misclassification are identified. First, the expressions "into account" or "accounting for" in a paper's abstract lead the system to judge that the paper belongs to the accounting field. Second, the expression "of someone's business" in a paper's abstract leads the system to judge that the paper belongs to the business field. This study corrects these misclassifications to remind the academic community that, when using WOS, researchers must not rely solely on the system's search classification and rush into analysis. Instead, they must first confirm, from the perspective of domain knowledge, whether the retrieved sample really matches the research topic.
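The kind of check described above can be sketched in a few lines of Python. This is only an illustrative heuristic, not the procedure used in the study: the pattern lists, the function name `is_suspect_hit`, and the assumption that a topic search matches word stems such as "account" are all hypothetical. The sketch flags an abstract for manual review when every occurrence of the topic word sits inside one of the idiomatic phrases.

```python
import re

# Hypothetical idiom patterns: phrases that contain the topic word
# but usually say nothing about the topic itself.
FALSE_POSITIVE_PATTERNS = {
    "accounting": [r"\binto account\b", r"\baccount(?:ing|s|ed)? for\b"],
    "business": [r"\bof (?:\w+'s|my|your|his|her|its|our|their) business\b"],
}

# Hypothetical stem-like patterns standing in for a topic search.
TOPIC_PATTERNS = {
    "accounting": r"\baccount(?:ing|s|ed)?\b",
    "business": r"\bbusiness(?:es)?\b",
}


def is_suspect_hit(abstract: str, topic: str) -> bool:
    """Return True if the topic word occurs only inside idiomatic phrases."""
    text = abstract.lower()
    if re.search(TOPIC_PATTERNS[topic], text) is None:
        return False  # the topic word never occurs, so there is no hit to question
    # Blank out the idioms, then see whether any topic word is left.
    stripped = text
    for pattern in FALSE_POSITIVE_PATTERNS[topic]:
        stripped = re.sub(pattern, " ", stripped)
    return re.search(TOPIC_PATTERNS[topic], stripped) is None


if __name__ == "__main__":
    samples = [
        "We take citation counts into account.",
        "This paper studies accounting research and takes firm size into account.",
    ]
    for sample in samples:
        print(is_suspect_hit(sample, "accounting"), "|", sample)
```

A flagged abstract is only a candidate for misclassification. A genuine accounting paper can also use "accounting for", so the final decision still rests on domain knowledge.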