文本停用詞與關鍵詞之分辨:Scientometrics個案分析

Distinguishing Text Stop Words and Keywords: A Case Study of Scientometrics

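Abstract


The first step in text analysis is to break the text down so as to filter out the interference of meaningless stop words and to select the keywords that carry the main points; this is especially important for academic databases. The sample for this study comes from the Thomson Reuters Web of Science (WOS) database: we retrieved 256 academic papers published by scholars from Taiwan in the journal Scientometrics between 1995 and 2024, then searched them by the topics accounting and business to explore whether WOS misclassifies papers. First, when a paper's abstract uses the expression "into account" (to take into consideration) or "accounting for" (to contribute a share), the system mistakes the paper as belonging to the field of accounting. Second, when a paper's abstract uses the expression "of business" (in the sense of whether something is relevant), the system mistakes the paper as belonging to the field of business. This study corrects these specific misclassifications to remind the academic community that, when using the WOS system, one must never rush into analysis relying on the system's search classification alone; one must first confirm, from the perspective of domain knowledge, whether the retrieved sample is truly what the research topic requires.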

Parallel Abstract


The first step in text analysis is to parse the text so as to filter out the interference of stop words and to select keywords. This is especially important for academic databases. The sample for this study was retrieved from the Thomson Reuters Web of Science (WOS) database and consists of 256 academic papers published by Taiwanese scholars in Scientometrics from 1995 to 2024. The sample was then searched using the topic words accounting and business to examine whether WOS misclassifies papers. First, the expression "into account" or "accounting for" in a paper's abstract results in the paper being assigned to the accounting category. Second, the expression "of someone's business" in a paper's abstract results in the paper being assigned to the business category. This study corrects these specific misclassifications to remind the academic community that, when using the WOS system, researchers must not rely solely on the system's search classification and rush into analysis. Instead, they must first confirm, from the perspective of domain knowledge, whether the samples obtained through the search actually fit the research subject.
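
The screening idea described above can be illustrated with a small sketch. It is not the authors' procedure and does not reproduce how WOS matches topics; it only assumes that a topic hit counts as a false positive when the topic word appears solely inside one of the stop expressions named in the abstract. The names `STOP_PHRASES`, `TOPIC_PATTERNS`, and `classify` are hypothetical, and the phrase lists would need review by someone with domain knowledge.

```python
import re

# Hypothetical stop expressions, taken from the phrases quoted in the abstract.
# Limitation: a phrase such as "school of business" would also be blanked out,
# so these lists are a starting point, not a validated rule set.
STOP_PHRASES = {
    "accounting": [r"\binto account\b", r"\baccounting for\b"],
    "business": [r"\bof (?:\w+'s )?business\b"],
}

# Assumed topic-word patterns (what a topic search might latch onto).
TOPIC_PATTERNS = {
    "accounting": r"\baccount(?:ing)?\b",
    "business": r"\bbusiness\b",
}


def classify(abstract: str, topic: str) -> str:
    """Label an abstract as 'no match', 'stop phrase only', or 'keyword'."""
    text = abstract.lower()
    topic_pattern = TOPIC_PATTERNS[topic]

    # The topic word does not occur at all.
    if re.search(topic_pattern, text) is None:
        return "no match"

    # Blank out every stop expression, then look for the topic word again.
    for phrase in STOP_PHRASES[topic]:
        text = re.sub(phrase, " ", text)

    if re.search(topic_pattern, text) is None:
        return "stop phrase only"  # likely misclassified by a naive topic match
    return "keyword"


if __name__ == "__main__":
    samples = [
        ("We take citation counts into account.", "accounting"),
        ("Taking size into account, we study accounting firms.", "accounting"),
        ("Collaboration is none of someone's business.", "business"),
    ]
    for abstract, topic in samples:
        print(topic, "->", classify(abstract, topic))
```

Records labeled "stop phrase only" are candidates for manual review rather than automatic exclusion, which keeps the final decision with a reader who knows the field.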

