
Construction of the Japanese-Chinese Parallel Corpus and Collocation Research: A Comparison with the Corpus of Taiwanese Learners of Japanese (CTLJ)

Abstract


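This paper introduces the method used to construct the Japanese-Chinese Parallel Corpus. It also draws on the large volume of corpus data and on statistical methods to compare collocations in the Japanese-Chinese Parallel Corpus with those in the Corpus of Taiwanese Learners of Japanese (CTLJ). The comparison examines how collocation use differs between a natural-language corpus and a learner corpus, and asks what information the comparison of these two different kinds of corpora reveals. Taking the original-text portion of the publicly released CTLJ and the Japanese portion of the Japanese-Chinese Parallel Corpus, we used the KL-Divergence function under Compare in our self-developed Collocation Tool to compare collocations across the two corpora. The results show that if a collocation pair receives a high positive score in both CTLJ and the parallel corpus, the collocation has probably occurred many times in both. If, however, a collocation pair receives a negative value in CTLJ but a positive value in the parallel corpus, it is probably a collocation that learners underuse and may need to learn. Conversely, if a collocation pair receives a positive value in CTLJ but a negative value in the parallel corpus (Japanese portion), it is probably a collocation that learners misuse. In sum, comparing CTLJ (original texts) with the parallel corpus (Japanese) can effectively reveal the collocations that learners underuse or frequently misuse.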

Parallel Abstract


This paper presents the construction of the Japanese-Chinese Parallel Corpus and an analysis of its collocations. Drawing on large amounts of corpus data and on statistical methods, it compares the parallel corpus with CTLJ and examines how collocation usage differs between the two. The KL-Divergence function of the self-developed Collocation Tool is used to compare the original texts of the publicly available CTLJ with the Japanese portion of the parallel corpus. The results show that a collocation pair with a high positive score in both CTLJ and the parallel corpus is probably used frequently in both. A collocation pair with a negative score in CTLJ but a positive score in the parallel corpus is probably seldom used by learners and may need to be learned. In contrast, a collocation pair with a positive score in CTLJ but a negative score in the parallel corpus is probably misused. Comparing CTLJ with the parallel corpus thus effectively reveals the collocation pairs that learners seldom use or easily misuse.
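The sign-based reading of the scores can be sketched in a few lines of Python. The abstract does not say how the Collocation Tool computes its KL-Divergence score, so the scoring function below is only an assumption: it scores each pair by its pointwise contribution to the KL divergence between the observed joint distribution and an independence model, which is one formula that yields a separate signed score per corpus. The names `pair_scores` and `classify`, the labels, and the toy pairs (`n1`, `v1`, and so on) are all hypothetical and are not taken from the paper or its data.

```python
import math
from collections import Counter


def pair_scores(pairs):
    """Score each (w1, w2) pair within one corpus.

    ASSUMPTION: score = p(w1, w2) * log2(p(w1, w2) / (p(w1) * p(w2))),
    the pointwise contribution to the KL divergence between the observed
    joint distribution and an independence model. The tool described in
    the abstract may compute its score differently.
    """
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(w1 for w1, _ in pairs)
    right = Counter(w2 for _, w2 in pairs)
    scores = {}
    for (w1, w2), count in joint.items():
        p_xy = count / n
        p_x = left[w1] / n
        p_y = right[w2] / n
        scores[(w1, w2)] = p_xy * math.log2(p_xy / (p_x * p_y))
    return scores


def classify(learner_score, reference_score):
    """Apply the three sign patterns described in the abstract."""
    if learner_score > 0 and reference_score > 0:
        return "used in both"
    if learner_score < 0 < reference_score:
        return "underused by learners"
    if reference_score < 0 < learner_score:
        return "possible misuse"
    return "undetermined"


# Invented toy data: placeholder tokens, not real corpus counts.
learner_pairs = [
    ("n1", "v1"),
    ("n1", "v2"), ("n1", "v2"),
    ("n2", "v1"), ("n2", "v1"),
    ("n2", "v2"),
    ("n3", "v3"),
]
reference_pairs = [
    ("n1", "v1"), ("n1", "v1"),
    ("n1", "v2"),
    ("n2", "v1"),
    ("n2", "v2"), ("n2", "v2"),
    ("n3", "v3"),
]

learner = pair_scores(learner_pairs)
reference = pair_scores(reference_pairs)

# Compare only pairs attested in both corpora; absent pairs have no score here.
for pair in sorted(set(learner) & set(reference)):
    print(pair, classify(learner[pair], reference[pair]))
```

Pairs that occur in only one corpus receive no score in this sketch, so handling them (for example with smoothing) would be a further design decision that the abstract does not address.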

