
語言探索與字詞計算詞典2015簡體中文版之修訂與應用

Revision and Application of SC-LIWC2015

Advisor: 林瑋芳

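Abstract


Text analysis has had an important influence on the development of psychology. Language is the bridge through which people communicate and express their inner thoughts, and it is the key researchers use to probe an individual's inner world. The field moved from early analyses of dreams and slips of the tongue, through the later development of projective tests, to the Linguistic Inquiry and Word Count (LIWC) program, which was developed after several twists and turns once the computer age arrived. LIWC has made the psychological analysis of text far more convenient and has linked qualitative and quantitative research.

The core assumption of LIWC is that the frequency with which words from specific categories are used can represent an individual's inner psychological processes. The LIWC dictionary was developed by Pennebaker's team. It defines dozens of categories and the words belonging to each, and over more than a decade it has evolved from the original LIWC to the current LIWC2015. The Chinese LIWC dictionary was first produced by Huang et al. (2012), who revised the LIWC2007 version, and Lin et al. (2020) subsequently revised the LIWC2015 version. However, Chinese culture is broad in scope, and Chinese is written in Simplified as well as Traditional characters, so demand for a Simplified Chinese LIWC dictionary has been growing. This motivates the present research: taking the Traditional Chinese LIWC2015 dictionary as the source, we build a Simplified Chinese LIWC2015 dictionary and extend LIWC to text analysis in Simplified Chinese.

This thesis comprises four studies. Study 1 revises the Simplified Chinese LIWC2015 dictionary. Study 2 localizes the netspeak category of the dictionary and tests its effectiveness. Studies 3 and 4 each test the validity of the completed dictionary, and Study 3 additionally compares and discusses different word-segmentation systems. Taken together, the four studies establish that the Simplified Chinese LIWC2015 dictionary performs stably and can serve as a research tool for future work on the psychological characteristics of Simplified Chinese users.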
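The core assumption described above, that category-level word frequencies index psychological processes, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the category names and word lists are toy examples and are not taken from the SC-LIWC2015 dictionary, the function name `category_percentages` is hypothetical, and the input is assumed to be already segmented into words. Scores are expressed as a percentage of total tokens.

```python
from collections import Counter

# Hypothetical toy dictionary: category name -> set of words.
# These entries are illustrative only and are NOT taken from SC-LIWC2015.
TOY_DICTIONARY = {
    "posemo": {"开心", "喜欢"},
    "negemo": {"难过", "伤心"},
    "i": {"我"},
}


def category_percentages(tokens, dictionary):
    """Return, for each category, the percentage of tokens that belong to it."""
    total = len(tokens)
    if total == 0:
        return {category: 0.0 for category in dictionary}
    counts = Counter()
    for token in tokens:
        # A word may belong to more than one category, so check all of them.
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return {category: 100.0 * counts[category] / total for category in dictionary}


# The tokens are assumed to come from an upstream word-segmentation step.
tokens = ["我", "今天", "很", "难过", "但是", "我", "喜欢", "工作"]
scores = category_percentages(tokens, TOY_DICTIONARY)
print(scores)
```

Because the score is a proportion of all tokens, texts of different lengths can be compared on the same scale, although very short texts will give unstable percentages.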

Parallel Abstract


How can we detect individuals’ thoughts and inner processes? This has become one of the most important questions in psychology. Previous research usually applied content analysis to text data, an approach that relies on subjective judgment and takes considerable time and effort. With advances in technology, a growing number of computer-based text analysis methods have been developed. Linguistic Inquiry and Word Count (LIWC), developed by Dr. Pennebaker, is one such computerized text analysis tool. The main assumption of LIWC is that the frequency of word usage serves as a language marker that indexes individuals’ feelings and thoughts. LIWC has been widely used in various fields, including psychology, education, and health. LIWC consists of two parts: a dictionary that defines a set of categories and the words belonging to them, and a computer program that calculates the frequency of word usage in each category. Of the two, the dictionary is without doubt the key component. The original LIWC dictionary was in English and has been translated and revised into more than ten languages. The first Chinese version of the LIWC dictionary (CLIWC2007) was produced by Huang et al. (2012), followed by a revision based on the latest version (CLIWC2015) by Lin et al. (2020). However, both CLIWC2007 and CLIWC2015 were based on Traditional Chinese data and may be inappropriate for analyzing text in Simplified Chinese. Thus, the main purpose of this thesis is to construct a Simplified Chinese version of the LIWC dictionary (SCLIWC). We conducted four studies to establish a validated SCLIWC dictionary. Study 1 details the revision process of SCLIWC2015. Study 2 focuses on revising the “netspeak” category, which collects words commonly used on social media or in short message services. Studies 3 and 4 use text samples on different topics to examine the validity of SCLIWC2015.
Study 3 examines the psychological differences between individuals who are still in a romantic relationship and those who have broken up. We processed all text samples with two word-segmentation tools, CKIP and Jieba, to examine whether word segmentation influences LIWC results. Study 4 examines the validity of SCLIWC2015 with text samples on two topics: work experiences and sad emotional experiences. Half of the samples were collected from Weibo, are in Simplified Chinese, and were analyzed with SCLIWC2015; the other half were collected from PTT, are in Traditional Chinese, and were analyzed with CLIWC2015. Both dictionaries yielded consistent results. Across the four studies, our findings suggest that SCLIWC2015 has a good detection rate and good validity for texts in Simplified Chinese.
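Why word segmentation could matter for dictionary-based counts can be seen in a small sketch. It does not use CKIP or Jieba. Instead it assumes a toy greedy longest-match segmenter (`forward_max_match`, a hypothetical helper) run with two different hypothetical vocabularies, and a one-word toy category that is not taken from SCLIWC2015. The point is only that the same sentence can yield different category counts depending on where word boundaries fall.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Greedy longest-match segmentation (a toy stand-in, not CKIP or Jieba).

    At each position, take the longest substring found in the vocabulary;
    characters that match nothing become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


def category_count(tokens, category_words):
    """Count how many tokens appear in a category's word list."""
    return sum(1 for token in tokens if token in category_words)


NEGEMO = {"难过"}      # hypothetical category entry, for illustration only
text = "我很难过"

vocab_a = {"难过"}     # segmenter A knows the emotion word
vocab_b = {"很难"}     # segmenter B prefers a different boundary

tokens_a = forward_max_match(text, vocab_a)
tokens_b = forward_max_match(text, vocab_b)

print(tokens_a, category_count(tokens_a, NEGEMO))
print(tokens_b, category_count(tokens_b, NEGEMO))
```

Under segmenter A the emotion word survives as one token and is counted. Under segmenter B it is split across a boundary and the category count drops to zero, which is the kind of discrepancy a comparison of segmentation tools would need to check for.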

References


Chinese-language references:
黃群弼 (2008). 中文繁簡等義詞自動辨識之研究 (Master's thesis). National Chengchi University. https://hdl.handle.net/11296/wuct4a
黃金蘭 (Chin-Lan Huang), Cindy K. Chung, Natalie Hui, 林以正 (Yi-Cheng Lin), 謝亦泰 (Yi-Tai Seih), Ben C. P. Lam, 程威銓 (Wei-Chuan Chen), Michael H. Bond, & James W. Pennebaker (2012). 中文版「語文探索與字詞計算」詞典之建立. 中華心理學刊, 54(2), 185-201. doi:10.6129/CJP.2012.5402.04
金慧蘭 (2006). 現代漢語新詞研究 (Master's thesis). National Chengchi University. http://nccuir.lib.nccu.edu.tw/handle/140.119/35617
林瑋芳, 黃金蘭, & 林以正 (2014). 來得早不如來得巧:中庸與陰陽轉折的時機. 中國社會心理學評論, 7, 89-107.
