Punctuation marks in text are often taken as cues for breath pauses. However, not all pauses occur at punctuation marks, and not all punctuation marks call for a pause. In this paper, we introduce a method for automatically suggesting speech pauses for a script submitted by an English language learner. In our approach, the script is stripped of punctuation and represented with features aimed at predicting appropriate pause points. The method involves automatically generating training data annotated with pauses, automatically transforming the training data into linguistic features, and automatically training a discriminative classifier to identify pause points. Evaluation shows that the proposed method achieves satisfactory performance in suggesting pauses for a given speech script.
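The three steps named above (pause-annotated training data, linguistic features, a discriminative classifier) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, since the abstract does not give these details: the `|` pause marker, the neighbouring-word features, the choice of a perceptron as the discriminative classifier, and all function names are hypothetical and are not the authors' implementation.

```python
import re
from collections import defaultdict

PAUSE = "|"  # hypothetical pause marker used in the annotated training text


def annotated_to_examples(annotated):
    """Turn 'a b | c' into (words, labels); labels[i] is 1 if a pause follows words[i]."""
    words, labels = [], []
    for token in annotated.split():
        if token == PAUSE:
            if labels:
                labels[-1] = 1
        else:
            # Strip punctuation so the model only sees non-punctuated text.
            word = re.sub(r"[^\w']", "", token).lower()
            if word:
                words.append(word)
                labels.append(0)
    return words, labels


def boundary_features(words, i):
    """Assumed features for the boundary after words[i]: the two neighbouring words."""
    nxt = words[i + 1] if i + 1 < len(words) else "</s>"
    return ["bias", "prev=" + words[i], "next=" + nxt, "pair=" + words[i] + "_" + nxt]


def train_perceptron(sentences, epochs=10):
    """Train a binary perceptron (one simple discriminative classifier) on word boundaries."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, labels in sentences:
            for i, gold in enumerate(labels[:-1]):  # the end of the script is not a decision
                feats = boundary_features(words, i)
                pred = 1 if sum(weights[f] for f in feats) > 0 else 0
                if pred != gold:
                    for f in feats:
                        weights[f] += gold - pred
    return weights


def suggest_pauses(text, weights):
    """Remove punctuation from a script and insert PAUSE where the classifier fires."""
    words, _ = annotated_to_examples(text)
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i + 1 < len(words):
            feats = boundary_features(words, i)
            if sum(weights.get(f, 0.0) for f in feats) > 0:
                out.append(PAUSE)
    return " ".join(out)


# Toy usage: one hand-annotated sentence stands in for the automatically generated data.
training = [annotated_to_examples("when I was young | I lived in a small town | near the sea")]
model = train_perceptron(training)
print(suggest_pauses("When I was young, I lived in a small town near the sea.", model))
```

On this toy input the sketch only reproduces the pauses it was trained on. A realistic setup would need a large annotated corpus and richer features (for example part-of-speech tags or phrase length), which the abstract mentions only as "linguistic features".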