Title

中文情緒詞庫的建造與標記

Translated Titles

Affective Lexicon in Chinese – Construction and Annotation

DOI

10.6342/NTU201602978

Authors

呂珮瑜

Key Words

情緒指稱詞 ; 情緒示意詞 ; 情緒詞 ; 語意韻律 ; 詞組塊 ; emotion denoting words ; emotion signaling words ; emotion words ; semantic prosody ; chunk

Publication Name

Degree theses of the Graduate Institute of Linguistics, National Taiwan University

Volume or Term/Year and Month of Publication

2015

Academic Degree Category

Master's

Advisor

謝舒凱

Content Language

English

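Chinese Abstract (English translation)

An emotion lexicon is a basic resource for emotion detection research. Most existing open Chinese lexicons consist mainly of emotion-denoting (affect-denoting) words and include comparatively few emotion-signaling (affect-signaling) words. From the perspective of cognitive semantics and pragmatics, however, emotion-signaling words play a critical role in how language is used to express emotion. Semantic prosody shows that seemingly neutral words in fact carry implicit positive or negative associations; moreover, the mapping between emotion and language often crosses word boundaries, so that chunks can also express emotion, rather than each word mapping to one fixed emotion. This study therefore integrates and classifies existing Chinese emotion-denoting word lists, and manually collects and annotates Chinese emotion-signaling words, as a basic resource for Chinese emotion detection research; it also demonstrates the usefulness of functional grammar for recognizing emotion in text. The study has two phases: the first is manual collection, annotation, and classification; the second is evaluation and application of the lexicon. In the first phase, emotion-denoting words from existing lexicons are integrated and classified into five categories (happy, sad, afraid, angry, surprised) and then, according to the intensity and duration of the emotion each word denotes, subdivided into three classes: emotion, mood, and temperament. Emotion-signaling words are collected from corpora representing two perspectives, author-classified emotional articles (900 posts from the PTT mood board) and reader-classified emotional articles (1,000 Yahoo mood news articles), in which chunks are manually annotated and classified. Common emotional expressions such as interjections, emoticons, profanity, and insults are also included. The second-phase evaluation has two steps. The first step estimates the emotion-prediction ability of each emotion-signaling word; this value is the average emotion score of the ten words that follow each occurrence of the word in the text corpus. The second step tests this prediction ability: ten positive and ten negative groups of emotion-signaling words are sampled, and, using a simple method that sums emotion-word scores, with human-rated emotional texts as the standard, the gain in accuracy when emotion-signaling words are included is measured. Accuracy rises by 4.78% on average for the positive groups and by 18.18% for the negative groups. Finally, as an application, the lexicon is used in the machine-learning study of emotion detection in short Chinese texts by Magistry et al. (2015), raising the F1 score by nearly 2%.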
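
The emotion-prediction score described in this abstract (the average emotion score of the ten words following each occurrence of a signaling word) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `prediction_score`, the toy data, the choice to average per occurrence before averaging across occurrences, and the score of 0.0 for words missing from the lexicon are all assumptions made for illustration. Multi-word chunks are assumed to be already merged into single tokens.

```python
from statistics import mean


def prediction_score(target, documents, word_scores, window=10):
    """Average, over every occurrence of `target`, of the mean emotion
    score of the up-to-`window` tokens that follow it."""
    per_occurrence = []
    for tokens in documents:
        for i, token in enumerate(tokens):
            if token != target:
                continue
            following = tokens[i + 1 : i + 1 + window]
            if following:  # skip occurrences at the very end of a text
                # Assumption: words missing from the lexicon score 0.0.
                per_occurrence.append(
                    mean([word_scores.get(w, 0.0) for w in following])
                )
    # Assumption: a word that never occurs gets a neutral score of 0.0.
    return mean(per_occurrence) if per_occurrence else 0.0


# Toy data, invented for illustration only (not taken from the thesis).
toy_scores = {"好": 1.0, "壞": -1.0}
toy_docs = [["他", "竟然", "壞", "壞", "好"], ["竟然", "壞"]]
score = prediction_score("竟然", toy_docs, toy_scores)
```

With this toy data the score of 竟然 comes out negative, which is the kind of polarity bias that semantic prosody attributes to seemingly neutral words.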

English Abstract

An affective lexicon is a fundamental resource for sentiment detection. However, most existing Chinese affective lexicons consist mainly of affect-denoting words and lack affect-signaling words. From the perspective of cognitive semantics and pragmatics, affect-signaling words play a critical role in how emotion is expressed in language use. Semantic prosody explains how seemingly neutral words can be associated with positive or negative polarity, while functional theory shows that the connection between words and meaning is not one-to-one, and neither is the connection between words and emotion. The correspondence between emotion and linguistic expression may extend beyond word boundaries to chunks. This research therefore aims to collect and annotate affect-signaling words and to organize them, together with affect-denoting words, into a multi-dimensional affective lexicon for Chinese. The result serves not only as an open resource for sentiment analysis but also as evidence of how functional grammar works in sentiment detection in text. The research proceeds in two phases: the first is the manual collection, annotation, and categorization of the affective lexicon; the second is its evaluation and application. In phase one, affect-denoting words are sorted into five categories (happy, sad, scared, angry, and surprised) and three levels (emotion, mood, and temperament) according to the strength and duration of the emotion they denote. Affect-signaling words are collected and annotated from two corpora: author-oriented emotional articles (from a BBS) and reader-oriented emotional news (from Yahoo News). Common emotion expressions are collected as well, including interjections, emoticons, and expletives. In phase two, the emotion-prediction ability of each affect-signaling word is calculated as the mean emotion score of the ten words that follow it.
To test this measure, random samples of affect-signaling words are added to NTUSD, which serves as the affective lexicon for sentiment analysis, and accuracy is compared with and without the affect-signaling words. Accuracy improves by 4.78% for positive affect-signaling words and by 18.18% for negative ones. In the application, the whole affective lexicon is applied to an unsupervised machine learning approach to sentiment detection in Chinese micro-blog data (Magistry et al., 2015), and yields a promising improvement of nearly 2% over the original F1-score.
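
The with/without comparison described in this abstract can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the thesis's procedure: the helper names `classify` and `accuracy`, the three-way positive/negative/neutral decision rule, and all words, scores, and gold labels are hypothetical toy values. The base lexicon here merely stands in for a polarity lexicon such as NTUSD.

```python
def classify(tokens, lexicon):
    """Label a text by the sign of the summed lexicon scores of its tokens."""
    total = sum(lexicon.get(token, 0.0) for token in tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def accuracy(documents, gold_labels, lexicon):
    """Fraction of documents whose predicted label matches the gold label."""
    correct = sum(
        classify(tokens, lexicon) == gold
        for tokens, gold in zip(documents, gold_labels)
    )
    return correct / len(documents)


# Toy data, invented for illustration only (not taken from the thesis).
base_lexicon = {"開心": 1.0, "難過": -1.0}       # stand-in for a base lexicon
signaling_words = {"竟然": -0.5}                 # hypothetical signaling word
extended_lexicon = {**base_lexicon, **signaling_words}

docs = [["今天", "很", "開心"], ["他", "竟然", "遲到"], ["我", "很", "難過"]]
gold = ["positive", "negative", "negative"]

acc_base = accuracy(docs, gold, base_lexicon)
acc_extended = accuracy(docs, gold, extended_lexicon)
```

In this toy setup the second text contains no word from the base lexicon, so only the extended lexicon labels it correctly; the difference `acc_extended - acc_base` corresponds to the kind of accuracy gain the abstract reports.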

Topic Category Humanities > Linguistics
College of Liberal Arts > Graduate Institute of Linguistics
Reference
  1. Baider, F., & Cislaru, G. (Eds.). (2014). Linguistic approaches to emotion in context. John Benjamins Publishing Company.
  2. Barrett, L. F., Lindquist, K. A., & Gendron, M. (2007). Language as context for the perception of emotion. Trends in cognitive sciences, 11(8), 327-332.
  3. Besnier, N. (1990). Language and affect. Annual review of anthropology, 419-451.
  4. Cambria, E., & Hussain, A. (2012). Sentic computing. Springer.
  5. Clark, L. A., Watson, D., & Mineka, S. (1994). Temperament, personality, and the mood and anxiety disorders. Journal of abnormal psychology, 103(1), 103.
  6. Cohen, A. A., & Harrison, R. P. (1973). Intentionality in the use of hand illustrators in face-to-face communication situations. Journal of Personality and Social Psychology, 28(2), 276.
  7. De Spinoza, B., & Curley, E. M. (1985). The collected works of Spinoza. vol. 2: Princeton University Press, 1985.
  8. Derks, D., Bos, A. E., & Von Grumbkow, J. (2007). Emoticons and social interaction on the Internet: the importance of social context. Computers in human behavior, 23(1), 842-849.
  9. Ekman, P., & Friesen, W. V. (1981). The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Nonverbal communication, interaction, and gesture, 57-106.
  10. Ekman, P., et al. (1987). Universals and cultural differences in the judgments of facial expressions of emotion. Journal of personality and social psychology, 53(4), 712.
  11. Ekman, P., Sorenson, E. R., & Friesen, W. V. (1969). Pan-cultural elements in facial displays of emotion. Science, 164(3875), 86-88.
  12. Foolen, A. (2012). The relevance of emotion for language and linguistics. Moving ourselves, moving others: Motion and emotion in intersubjectivity, consciousness and language, 349-369.
  13. Jack, R. E., Garrod, O. G., & Schyns, P. G. (2014). Dynamic facial expressions of emotion transmit an evolving hierarchy of signals over time. Current biology, 24(2), 187-192.
  14. James, W. (1884). II.—What is an emotion?. Mind, (34), 188-205.
  15. Jay, T. B., & Danks, J. H. (1977). Ordering of taboo adjectives. Bulletin of the Psychonomic Society, 9(6), 405-408.
  16. Kovecses, Z. (2000). The scope of metaphor. Metaphor and metonymy at the crossroads: A cognitive perspective, 30, 79.
  17. Lee, S. Y. M., Chen, Y., Li, S., & Huang, C. R. (2010a). Emotion Cause Events: Corpus Construction and Analysis. In LREC.
  18. Li, J., Xu, Y., Xiong, H., & Wang, Y. (2010). Chinese Text Emotion Classification Based On Emotion Dictionary.
  19. Liu, B., & Zhang, L. (2012). A survey of opinion mining and sentiment analysis. Mining Text Data, Springer US, 415–463.
  20. Murphy, B. (2010). Corpus and sociolinguistics: Investigating age and gender in female talk (Vol. 38). John Benjamins Publishing.
  21. Norrick, N. R. (2009). Interjections as pragmatic markers. Journal of pragmatics, 41(5), 866-891.
  22. Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2), 1–135.
  23. Parkinson, B. (Ed.). (1996). Changing moods: The psychology of mood and mood regulation. Addison-Wesley Longman Limited.
  24. Partington, A. (1998). Patterns and meanings: Using corpora for English language research and teaching (Vol. 2). John Benjamins Publishing.
  25. Picard, R. W. (1997). Affective computing (Vol. 252). Cambridge: MIT Press.
  26. Plutchik, R. (1980). Emotion: A psychoevolutionary synthesis. Harpercollins College Division.
  27. Speer, R., & Havasi, C. (2013). ConceptNet 5: A large semantic network for relational knowledge. In The People’s Web Meets NLP. Springer Berlin Heidelberg, 161-176.
  29. Staiano, J., & Guerini, M. (2014). DepecheMood: a Lexicon for emotion analysis from crowd-annotated news. arXiv preprint arXiv:1405.1605.
  30. Stubbs, M. (1996). Text and corpus analysis: Computer-assisted studies of language and culture. Oxford: Blackwell.
  31. Talmy, L. (2003). Toward a cognitive semantics (Vol. 1). MIT press.
  32. Taylor, J. R. (2012). The mental corpus: how language is represented in the mind. Oxford University Press.
  33. Thompsen, P. A., & Foulger, D. A. (1996). Effects of pictographs and quoting on flaming in electronic mail. Computers in Human Behavior, 12(2), 225-243.
  34. Turner, J. H. (1996). The evolution of emotions in humans: A Darwinian–Durkheimian analysis. Journal for the theory of social behaviour, 26(1), 1-33.
  35. Turner, J. H. (2000). On the origins of human emotions: A sociological inquiry into the evolution of human affect. Stanford, CA: Stanford University Press.
  36. Walther, J. B., & D’Addario, K. P. (2001). The impacts of emoticons on message interpretation in computer-mediated communication. Social science computer review, 19(3), 324-347.
  37. Watson, D., & Clark, L. A. (1992). On traits and temperament: General and specific factors of emotional experience and their relation to the five‐factor model. Journal of personality, 60(2), 441-476.
  38. Weigand, E. (Ed.). (2004). Emotion in dialogic interaction: advances in the complex (Vol. 248). John Benjamins Publishing.
  39. Wierzbicka, A. (1999). Emotions across languages and cultures: Diversity and universals. Cambridge University Press.
  40. Wilce, J. M. (2009). Language and emotion (No. 25). Cambridge University Press.
  41. 卓淑玲, 陳學志, & 鄭昭明. (2013). 台灣地區華人情緒與相關心理生理資料庫─ 中文情緒詞常模研究. Chinese Journal of Psychology, 55(4), 493-523.
  42. 黃金蘭, 林以正, 謝亦泰, & 程威銓. (2012). 中文版 [語文探索與字詞計算] 詞典之建立. Chinese Journal of Psychology, 54(2), 185-201.
  43. Albaugh, Q., Sevenans, J., Soroka, S., & Loewen, P. J. (2013). The Automated Coding of Policy Agendas: A Dictionary-Based Approach. In 6th Annual Comparative Agendas Conference, Antwerp, Belgium.
  44. Alm, C. O., Roth, D., & Sproat, R. (2005). Emotions from text: machine learning for text-based emotion prediction. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing (pp. 579-586). Association for Computational Linguistics.
  45. Aman, S., & Szpakowicz, S. (2007). Identifying expressions of emotion in text. In Text, Speech and Dialogue (pp. 196-205). Springer Berlin Heidelberg.
  46. Baccianella, S., Esuli, A., & Sebastiani, F. (2010). SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. In LREC, 10, 2200-2204.
  47. Bednarek, M. (2008). Emotion talk across corpora. New York: Palgrave Macmillan.
  48. Bublitz, W. (2002). Emotive Prosody: how attitudinal frames help construct context. In Anglistentag (pp. 381-392).
  49. Cheng, C. M., Chen, H. C., & Cho, S.-L. (2012). Affective words. In A Study on Standard Stimuli and Normative Responses of Emotion in Taiwan.
  50. Chen, Y., Lee, S. Y. M., Li, S., & Huang, C. R. (2010). Emotion cause detection with linguistic constructions. In Proceedings of the 23rd International Conference on Computational Linguistics. Association for Computational Linguistics, 179-187.
  51. Clark, L. A., & Watson, D. (1999). Temperament: A new paradigm for trait psychology. Handbook of personality: Theory and research, 2, 399-423.
  52. Darwin, C. (1859). On the origin of species by means of natural selection. London: Murray, 247.
  53. Ekman, P. (1984). Expression and the nature of emotion. Approaches to emotion, 3, 319-344.
  54. Ekman, P. E., & Davidson, R. J. (1994). The nature of emotion: Fundamental questions. Oxford University Press.
  55. Esuli, A., & Sebastiani, F. (2006). SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the Conference on International Language Resources and Evaluation (LREC), Genova, IT, 417–422.
  56. Jay, T. (1992). Cursing in America: A Psycholinguistic Study of Dirty Language in the Courts, in the Movies, in the Schoolyards, and on the Streets. John Benjamins Publishing.
  57. Jay, T. (1999). Why we curse: A neuro-psycho-social theory of speech. John Benjamins Publishing.
  58. Kovecses, Z. (2003). Metaphor and emotion: Language, culture, and body in human feeling. Cambridge University Press.
  59. Ku, L. W., & Chen, H. H. (2007). Mining opinions from the Web: Beyond relevance retrieval. Journal of the American Society for Information Science and Technology, 58(12), 1838-1850.
  60. Lee, S. Y. M., Chen, Y., & Huang, C. R. (2010b). A text-driven rule-based system for emotion cause detection. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text. Association for Computational Linguistics, 45-53.
  61. Lu, P. Y., Chang, Y. Y., & Hsieh, S. K. (2013). Causing Emotion in Collocation: An Exploratory Data Analysis. In ROCLING.
  62. Ortony, A., Clore, G. L., & Collins, A. (1990). The cognitive structure of emotions. Cambridge university press.
  63. Paltoglou, G., Thelwall, M., & Buckley, K. (2010). Online textual communications annotated with grades of emotion strength. In Proceedings of the 3rd International Workshop of Emotion: Corpora for research on Emotion and Affect, 25–31.
  64. Schnall, S. (2005). The pragmatics of emotion language. Psychological Inquiry, 28-31.
  65. Speer, R., & Havasi, C. (2012). Representing General Relational Knowledge in ConceptNet 5. In LREC (pp. 3679-3686).
  66. Strapparava, C., & Valitutti, A. (2004). WordNet-Affect: an affective extension of WordNet. In Proceedings of the Conference on International Language Resources and Evaluation (LREC), Lisbon, May, 1083–1086.
  67. Wilson, T., Wiebe, J., & Hwa, R. (2004). Just how mad are you? Finding strong and weak opinion clauses. In Proceedings of AAAI, 761–769.
  68. Wilson, T., Wiebe, J., & Hoffmann, P. (2005). Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on human language technology and empirical methods in natural language processing. Association for Computational Linguistics, 347-354.
  70. 何容. (1981). 國語日報字典. 國語日報社
  71. 張嘉文. (2009). 標準國語字典. 俊嘉文化出版
  72. 陳建美. (2009). 中文情感詞彙本體的構建及其應用 (博士論文, 大連: 大連理工大學).
  73. 謝旻琪. (2011). 如何捷進寫作詞彙:語言動作篇. 商周出版
  74. 謝旻琪. (2012). 如何捷進寫作詞彙:人物篇. 商周出版
  75. Project
  76. "Standard Stimuli and Normative Responses of Emotions (SSNRE) in Taiwan," Project website at http://ssnre.psy.ntu.edu.tw/ , 2012.
Times Cited
  1. 王雅詩(2017)。基於詞性組合的意見字典擴增方法之研究。淡江大學資訊管理學系碩士班學位論文。2017。1-62。