中文語音文件自動標題設定之進一步研究

Improved Automatic Title Generation for Chinese Spoken Documents

Advisor: Lin-Shan Lee (李琳山)

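Abstract


Rapid advances in technology have brought many conveniences, and the information we encounter in daily life is no longer limited to text; it now includes multimedia and spoken documents. To make such documents easier to organize and quick to browse, we first use automatic speech recognition to transcribe the speech signal of a multimedia document into text, and then process and present the result. The processing includes classification, automatic summarization, and automatic title generation. Besides automatic title generation for pure text documents, this thesis studies automatic title generation for automatically transcribed documents that contain recognition errors. Using spoken news as the example, it examines how recognition errors affect automatic title generation. The thesis has three main parts: "the baseline constructive title generation method", "improvements to the constructive title generation method and its application to automatically transcribed documents with errors", and "using different feature units in the constructive title generation method". Preliminary experiments show that some improvement in performance is achievable.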

Parallel Abstract


Title generation is expected to become important for easy browsing and retrieval of multimedia documents. Titles differ in nature from summaries, which makes automatic title generation a more challenging task; as a result, less improvement has been reported for it than for automatic summarization. In this thesis an improved non-extractive title generation method is developed. Each evaluation document is first summarized, and the output title is then found using Viterbi beam search with various scores learned from a training corpus. Very positive results were obtained.
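
The search step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the scores or the exact form of the Viterbi beam search, so the function `beam_search_title`, the two score tables, the back-off value, and the toy English words below are all hypothetical. The sketch assumes that summarization has already reduced the document to a small set of candidate words, and that training has produced a per-word selection score and a word-pair ordering score, both as log-scores.

```python
def beam_search_title(candidates, term_score, bigram_score, length, beam_width=3):
    """Return (best word sequence, its total log-score).

    candidates:   words kept after summarizing the document (assumed input)
    term_score:   dict word -> log-score for choosing the word in a title
    bigram_score: dict (prev, word) -> log-score for placing word after prev
    """
    unseen = -5.0  # assumed back-off log-score for word pairs not in the table
    # Each hypothesis is (total log-score, word tuple); "<s>" marks the start.
    beam = [(0.0, ("<s>",))]
    for _ in range(length):
        expanded = []
        for score, seq in beam:
            for word in candidates:
                if word in seq:  # do not reuse a word within one title
                    continue
                step = term_score[word] + bigram_score.get((seq[-1], word), unseen)
                expanded.append((score + step, seq + (word,)))
        if not expanded:  # fewer candidates than the requested length
            break
        # Prune: keep only the best `beam_width` partial titles.
        beam = sorted(expanded, key=lambda h: h[0], reverse=True)[:beam_width]
    best_score, best_seq = beam[0]
    return list(best_seq[1:]), best_score


# Toy, made-up scores standing in for values learned from a training corpus.
candidates = ["election", "results", "announced", "today"]
term_score = {"election": -0.5, "results": -0.7, "announced": -1.0, "today": -2.5}
bigram_score = {
    ("<s>", "election"): -0.3,
    ("election", "results"): -0.2,
    ("results", "announced"): -0.4,
    ("announced", "today"): -0.6,
}

title, score = beam_search_title(candidates, term_score, bigram_score, length=3)
print(" ".join(title), round(score, 2))
```

The beam keeps only a few partial titles at each step, which trades exactness for speed: a wider beam explores more word orders but costs more time per step.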

