中文多方對話語料集和日語日常會話語料集在表達歉意、請求和感謝上的文化差異及其在機器翻譯上的應用

Application of Cultural Differences in Expressing Apologies, Requests and Thanks in Multi-Party Dialogue and Everyday Japanese Conversation Datasets for Machine Translation

Advisor: 陳信希 (Hsin-Hsi Chen)

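Abstract


With the rapid development of machine translation technology, it is becoming increasingly common to use machine translation to communicate with people from different cultural backgrounds. In cross-cultural communication, the higher a speaker's language proficiency, the more likely the speaker is to be perceived as having a bad personality or lacking social etiquette when, in a given situation, the speaker's language use runs counter to the pragmatic rules of the listener's dialogue culture. This negative effect of cultural differences on interpersonal relationships may become more serious and more frequent in the future, because even current machine translation has language proficiency equal to or better than that of foreign language learners. Machine translators therefore need to perform culture-aware translation, that is, translation that uses language appropriately in the listener's dialogue culture.

In this study, we first confirm that language use differs across cultures, depending on the situation and the interpersonal relationship. Then, to reduce the problems caused by cultural differences through culture-aware machine translation, we created a culture-aware Chinese-Japanese parallel dialogue dataset that includes interpersonal relationship labels and three situation labels: apology, request, and thanks. We carried out a statistical analysis of the dataset. In addition, we used a sequence-to-sequence model to perform situation classification and culture-aware machine translation on the dataset.

We verified that interpersonal relationships contribute to situation classification for Japanese and to culture-aware machine translation from Chinese to Japanese. For situation classification for Chinese and culture-aware machine translation from Japanese to Chinese, the results are also consistent with the view that Chinese has relatively vague pragmatic rules governing how language use changes with contextual information. These results are important because the model can achieve culture awareness in a dialogue culture by incorporating contextual information, including interpersonal relationships.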

Parallel Abstract


With the rapid development of machine translation technology, it is becoming increasingly common to communicate with people from different cultures using machine translation. In cross-cultural communication, the higher the speaker's language skills (grammar, vocabulary, fluency, and so on), the more likely it is that the speaker will be perceived as having a bad personality or lacking social etiquette when the speaker's language use runs contrary to the pragmatic rules of the listener's dialogue culture in a given situation. This negative cross-cultural impact on human relations is likely to become more serious and more frequent in the future, since even current machine translators have the same or better language skills than foreign language learners. Machine translators must therefore perform culture-aware translation, that is, translation that uses language appropriately in the listener's dialogue culture.

In this study, we first confirm that language use varies from culture to culture depending on the situation and the interpersonal relationship. Then, to reduce the problems caused by cultural differences through culture-aware machine translation, we created a culture-aware parallel dataset of Chinese and Japanese dialogues that includes interpersonal relationship labels and three situation labels: apology, request, and thanks. We applied statistical analysis to the dataset. In addition, we used a sequence-to-sequence model to perform situation classification and culture-aware machine translation on the dataset.

We verified that the interpersonal relationship contributes to situation classification for Japanese and to culture-aware machine translation from Chinese to Japanese on our dataset. For Japanese-to-Chinese translation, the result is consistent with the view of Chen (2013) that Chinese has less clear pragmatic rules for varying language use according to interpersonal relationships. These results are important in that the model can be culture-aware in a dialogue culture when given contextual information, including the interpersonal relationship.
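
The abstract does not say how the interpersonal relationship label is supplied to the sequence-to-sequence model. As a minimal sketch of one common way to do it, assuming the label is prepended to the source utterance as a tag token, the following shows how a dataset record might be prepared with and without the label for an ablation. The record fields, the `<rel:...>` tag format, and the `friend` label are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Optional

# The three situation labels named in the abstract.
SITUATIONS = ("apology", "request", "thanks")


@dataclass(frozen=True)
class DialogueExample:
    """One parallel record. The field names are hypothetical."""
    source: str        # source-language utterance
    target: str        # target-language utterance
    situation: str     # one of SITUATIONS
    relationship: str  # interpersonal relationship label (assumed vocabulary)


def tag_source(text: str, relationship: Optional[str]) -> str:
    """Prepend a relationship tag to the source text.

    Passing None leaves the text unchanged, which gives the
    no-relationship baseline for an ablation.
    """
    if relationship is None:
        return text
    return f"<rel:{relationship}> {text}"


def build_inputs(example: DialogueExample, use_relationship: bool) -> tuple[str, str]:
    """Return the (model input, reference output) pair for translation."""
    if example.situation not in SITUATIONS:
        raise ValueError(f"unknown situation label: {example.situation}")
    label = example.relationship if use_relationship else None
    return tag_source(example.source, label), example.target


example = DialogueExample(
    source="不好意思，可以借我一支筆嗎？",
    target="すみません、ペンを貸してもらえますか。",
    situation="request",
    relationship="friend",
)
with_label = build_inputs(example, use_relationship=True)
without_label = build_inputs(example, use_relationship=False)
```

Comparing a model trained on the `with_label` form against one trained on the `without_label` form is one way to measure the kind of contribution the abstract reports for the relationship label.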

