

Automated Collocation Suggestion in Academic Writing

Advisors: 張俊盛, 劉顯親


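In recent years, the use of collocations has been widely discussed in foreign language teaching, and collocations are often described as a key to effectively improving learners' language proficiency. In academic writing, many scholars have also begun to notice this linguistic phenomenon and have found that collocation knowledge affects the overall quality of academic writing. Previous research indicates that effective collocation learning requires learners to become aware of the importance of collocations while they learn vocabulary, so that they can gradually strengthen their collocation learning. This strategy, however, is sometimes difficult to carry out in a real classroom: language instructors need sufficient collocation knowledge themselves and are also expected to give students timely advice on collocation use during teaching, yet giving such advice is time-consuming and labor-intensive. Providing collocation suggestions through language technology can also be difficult. An appropriate collocation suggestion requires an accurate understanding of the semantic restrictions involved, and pragmatic factors must be weighed according to the actual situation. At the current stage of language technology, collocation suggestion therefore remains an unsolved problem that needs further investigation. In this thesis, to help learners obtain appropriate collocation assistance while writing academic papers, we explore whether a writing assistance tool can be built with machine learning methods. Taking a sentence that needs a collocation suggestion as input, we build a classifier from training data and treat its classification results as the basis for the collocation suggestion. To build an effective collocation classifier, the choice of classification features is important. We therefore collect a corpus of academic texts and train the classifier on the contextual information surrounding the relevant suggestion words in that corpus, so that the classifier can automatically select the corresponding words as suggestions. We conducted experiments on verb-noun collocations, a type in which students commonly make errors. The results show that a classifier trained on contextual information can effectively provide collocation suggestions and rank them well. The results also indicate that the writing assistance framework we propose for academic papers can provide the collocation suggestions needed in academic writing from multiple aspects.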


The concept of collocation has been widely discussed in the field of language teaching for decades. Collocation has been shown to be important in helping language learners achieve native-like fluency. In the field of English for Academic Purposes, a growing number of researchers also recognize it as an important feature of academic writing. It is often argued that collocation influences the effectiveness of a piece of writing and that a lack of collocation knowledge may cause a cumulative loss of precision. Previous research indicates that effective collocation acquisition requires learners to be aware of collocations while they learn vocabulary. However, this strategy may not be easy to apply in a real-life classroom. Language instructors need not only rich knowledge of collocation but also help in correcting students' collocation errors, a task that is labor-intensive and time-consuming. In addition, automating collocation suggestion with language technology requires considerable effort. A proper collocation suggestion may involve knowing both the correct semantic usage and the correct pragmatic usage. Collocation suggestion is thus still an unresolved issue in need of particular attention. In this thesis, we demonstrate the feasibility of using a machine learning method to build a writing assistant that automatically prompts learners with collocation suggestions in academic writing. Given an input sentence that requires a collocation suggestion, we apply a data-driven classifier and treat the outcome of the classification as the suggested substitutions. For a robust classifier, feature selection is the key component. We make use of the target's contextual linguistic clues to elicit the most relevant suggestions from a reference corpus of scholarly texts. We carried out an experiment focusing on one of the major types of collocation problems, verb-noun collocations. In the experiment, the proposed classifier with contextual information returned satisfactory suggestions and achieved the best hit rank. Our framework for computer-assisted academic writing can facilitate learner-writers' collocation use and help them transfer that knowledge to their future writing.
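
The classification idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the classifier or the feature set, so everything below is an assumption made for illustration: a small naive Bayes model with add-one smoothing, a handful of invented training sentences, and the hypothetical function names `train` and `suggest`. It shows only how context words around a masked verb slot could be used to rank candidate verbs for a verb-noun collocation.

```python
import math
from collections import Counter, defaultdict

# Toy training data (invented for illustration, not from the thesis corpus):
# each item is the verb of a verb-noun collocation plus the context words
# found around the verb slot in a scholarly sentence.
TRAIN = [
    ("conduct", ["we", "an", "experiment", "on"]),
    ("conduct", ["they", "a", "study", "of"]),
    ("conduct", ["we", "a", "survey", "of"]),
    ("draw", ["we", "a", "conclusion", "from"]),
    ("draw", ["they", "the", "conclusion", "that"]),
    ("propose", ["we", "a", "method", "for"]),
    ("propose", ["we", "a", "framework", "for"]),
]


def train(examples):
    """Count verb frequencies and per-verb context-word frequencies."""
    verb_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for verb, context in examples:
        verb_counts[verb] += 1
        for word in context:
            word_counts[verb][word] += 1
            vocab.add(word)
    return verb_counts, word_counts, vocab


def suggest(context, model, top_n=3):
    """Rank candidate verbs by naive Bayes log-probability given the context."""
    verb_counts, word_counts, vocab = model
    total = sum(verb_counts.values())
    scores = {}
    for verb, count in verb_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[verb].values()) + len(vocab)
        for word in context:
            # Add-one smoothing so unseen context words do not zero out a verb.
            score += math.log((word_counts[verb][word] + 1) / denom)
        scores[verb] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


model = train(TRAIN)
# A learner wrote "we do an experiment on ..."; the verb slot is masked out
# and the remaining context words are used as features.
ranked = suggest(["we", "an", "experiment", "on"], model)
print(ranked)
```

On this toy data the top-ranked suggestion for the masked slot is `conduct`. A realistic system would need a much larger scholarly corpus and richer contextual features than the bag of neighboring words assumed here.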




