  • Thesis

華語為二語學習者之搭配詞能力發展:台灣華語文測驗學習者寫作語料分析

Development of Chinese L2 collocation competence: A corpus-based analysis of learner texts from TOCFL

Advisor: 陳正賢

Abstract


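This study analyzes how the lexical association and distribution of two-word combinations (bigrams) develop in texts written by learners of Chinese as a second language (L2) for the Test of Chinese as a Foreign Language (TOCFL). The author examines 2836 texts produced by L2 learners at four proficiency levels, evaluates every contiguous two-word combination in those texts, and compares these combinations with the collocations of native Chinese speakers drawn from the Academia Sinica Balanced Corpus of Modern Chinese. Using the bigram list of the Sinica Corpus as a dictionary, the author obtains four lexical metrics, namely mutual information (MI), Delta P, inverse document frequency (IDF), and unseen rate, and applies them to the bigrams in the L2 texts to assess learners' collocation competence. The thesis carries out two statistical analyses: a two-way ANOVA and a post hoc trend analysis. The two-way ANOVA examines the relationship between learners' proficiency level and L2 text genre on the one hand and learners' scores on the lexical metrics on the other. The results show that for MI, backward Delta P, and unseen rate, proficiency level and genre interact significantly in their effect on the metric scores. Intermediate learners show the lowest mean MI scores, possibly because of their growing vocabulary and their willingness to experiment. Backward Delta P scores show no clear increasing trend as proficiency rises. The only increase occurs in B1 letters, which may stem from the local constructions that B1 learners use. The unseen bigrams observed at C1 are mostly regarded as divergent representations, which suggests that advanced learners work hard to devise combinations that express their ideas even when most native speakers would not use them. For forward Delta P and IDF, by contrast, genre does not interact with the metric scores. Forward Delta P scores increase with proficiency level, which reflects the left-to-right direction of human language processing. High-IDF bigrams are used more widely by A2 and C1 learners because these learners use a considerable number of bigrams that relate to real life or belong to specific domains. The thesis offers a comprehensive, multifaceted analysis of L2 learners' bigram competence and stresses the importance of multi-word competence beyond single words in teaching Chinese as a second language.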

Parallel Abstract


This study evaluates the development of bigram lexical associations and distributions in texts written by learners of Chinese as a second language (L2) during the TOCFL writing test. Four proficiency levels were analyzed, amounting to 2836 compositions in total. All contiguous two-word combinations in the L2 texts were evaluated by comparing them with Chinese native speakers' collocation patterns taken from a reference corpus, the Academia Sinica Balanced Corpus of Modern Chinese. To examine the relationship between learners' proficiency levels and their multifaceted collocation competence, four distributional metrics were adopted: mutual information (MI), Delta P, inverse document frequency (IDF), and unseen rate. Using the Sinica Corpus bigram list as a dictionary, co-occurring two-word combinations in the L2 texts were given collocability scores to assess learners' collocation competence. The thesis performed two quantitative analyses: a two-way ANOVA and a post hoc trend analysis. The two-way ANOVA examined the relationship between learners' proficiency levels (LEVEL) and L2 text genres (GENRE) on the one hand and their collocability scores on the other. For MI, backward Delta P, and unseen rate, LEVEL and GENRE interact significantly in their effect on the collocability metric. For MI, although different genres develop differently across the proficiency levels, the intermediate levels show the lowest mean MI scores, which could be attributed to L2 learners' growing vocabulary size and their experimental mindset. For backward Delta P, no ascending trend is found. The only increase appears in B1 letters, which could be attributed to the emergence of local grammatical constructions at the B1 level. The unseen bigrams observed at C1 are mostly regarded as divergent but legitimate representations, showing that advanced learners have tried hard to come up with word combinations that convey their ideas even if most native speakers would not commonly use them. By contrast, for forward Delta P and IDF, there is no LEVEL and GENRE interaction. Forward Delta P has a positive linear trend across all proficiency levels, which reflects the preferred left-to-right orientation of human language processing. High-IDF bigrams are more common among A2 and C1 learners for two reasons. At the A2 level, they may result from the use of bigrams that are relevant to real-life circumstances. At the C1 level, they may be explained by domain-specific bigrams. The thesis provides a comprehensive analysis of multifaceted L2 collocation competence and highlights the importance of formulaicity beyond single words in Chinese as a second language (CSL).
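The four metrics named in the abstract can be illustrated with a minimal sketch. It uses the common corpus-linguistic definitions: MI as the log ratio of observed to expected bigram frequency, Delta P as a difference of conditional probabilities from a 2x2 contingency table, IDF as the log of the inverse document proportion, and unseen rate as the share of learner bigrams absent from the reference list. The abstract does not give the thesis's exact formulas, so several things here are assumptions made for illustration: taking "forward" to mean that the first word cues the second, averaging only over bigrams found in the reference list, the helper names (`ReferenceStats`, `score_text`), and the toy data, which does not come from the Sinica Corpus or TOCFL.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return all contiguous two-word combinations in a token list."""
    return list(zip(tokens, tokens[1:]))


def ratio(numerator, denominator):
    """Divide safely; an empty margin yields 0.0 instead of an error."""
    return numerator / denominator if denominator else 0.0


class ReferenceStats:
    """Bigram counts from a reference corpus (hypothetical helper class)."""

    def __init__(self, docs):
        self.n_docs = len(docs)
        self.pair = Counter()      # frequency of each bigram
        self.first = Counter()     # frequency of each word in the first slot
        self.second = Counter()    # frequency of each word in the second slot
        self.doc_freq = Counter()  # number of documents containing each bigram
        for doc in docs:
            found = bigrams(doc)
            self.pair.update(found)
            self.first.update(w1 for w1, _ in found)
            self.second.update(w2 for _, w2 in found)
            self.doc_freq.update(set(found))
        self.total = sum(self.pair.values())

    def table(self, bg):
        """2x2 counts: a = (w1, w2), b = (w1, other), c = (other, w2), d = rest."""
        a = self.pair[bg]
        b = self.first[bg[0]] - a
        c = self.second[bg[1]] - a
        d = self.total - a - b - c
        return a, b, c, d

    def mi(self, bg):
        """log2(observed / expected); defined only for bigrams seen in the reference."""
        a, b, c, _ = self.table(bg)
        return math.log2(a * self.total / ((a + b) * (a + c)))

    def delta_p_forward(self, bg):
        """P(w2 | w1) - P(w2 | not w1); assumed reading of 'forward'."""
        a, b, c, d = self.table(bg)
        return ratio(a, a + b) - ratio(c, c + d)

    def delta_p_backward(self, bg):
        """P(w1 | w2) - P(w1 | not w2); assumed reading of 'backward'."""
        a, b, c, d = self.table(bg)
        return ratio(a, a + c) - ratio(b, b + d)

    def idf(self, bg):
        """Natural log of (documents / documents containing the bigram)."""
        return math.log(self.n_docs / self.doc_freq[bg])


def score_text(tokens, ref):
    """Mean metric scores over seen bigrams, plus the unseen rate."""
    found = bigrams(tokens)
    seen = [bg for bg in found if ref.pair[bg] > 0]

    def mean(metric):
        return ratio(sum(metric(bg) for bg in seen), len(seen))

    return {
        "mi": mean(ref.mi),
        "delta_p_forward": mean(ref.delta_p_forward),
        "delta_p_backward": mean(ref.delta_p_backward),
        "idf": mean(ref.idf),
        "unseen_rate": ratio(len(found) - len(seen), len(found)),
    }


# Toy, pre-segmented reference documents and one learner text (illustrative only).
ref = ReferenceStats([
    ["我", "喜歡", "喝", "茶"],
    ["我", "喜歡", "喝", "咖啡"],
    ["他", "喜歡", "看", "書"],
])
scores = score_text(["我", "喜歡", "喝", "書"], ref)
print(scores)
```

In this toy run the learner bigram (喝, 書) does not occur in the reference list, so it counts toward the unseen rate and is left out of the other averages. A real pipeline would also need a word segmenter and the full reference bigram list, neither of which is shown here.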

