Chinese Spoken Information Summarization: Improvements to Models and Features

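Multimedia content containing audio and video keeps growing and now pervades the Internet and our daily lives, so processing and organizing it systematically and automatically has become an important problem. Speech is one of the most semantically rich components of multimedia content and can usually be used to represent the topics and concepts of a multimedia file. In recent years, many researchers have worked on organizing and understanding multimedia content, with substantial results in areas such as spoken document transcription, retrieval, and summarization. Document summarization can be extractive or non-extractive (abstractive). Extractive summarization selects important sentences, passages, or sections from the original document according to a given summarization ratio and assembles them into a summary; abstractive summarization generates the summary directly from the topics or concepts of the document. Because abstractive summarization remains considerably difficult, most current research on automatic spoken document summarization focuses on the extractive approach. This thesis studies extractive summarization of Chinese broadcast news spoken documents. We propose a probabilistic generative framework that tightly couples a sentence generative model with a sentence prior probability for selecting important sentences in extractive summarization. Each sentence of the document to be summarized is treated as a probabilistic generative model that predicts the probability of generating the document. We propose two probabilistic generative models: a combination of the hidden Markov model (HMM) and the relevance model (RM), and the word topical mixture model (wTMM). We also make a preliminary attempt to estimate the sentence prior probability from recognition confidence scores and a set of prosodic features. Experiments and analysis on a Chinese broadcast news corpus give preliminary results indicating that the proposed methods produce better summaries than other common methods.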

Keywords

spoken documents, extractive summarization, hidden Markov model, relevance model, word topical mixture model

Parallel Abstract

Multimedia content, including audio and video, is growing continuously and now fills networks and our daily lives. Speech is one of the most important information sources in multimedia content and usually conveys its concepts and topics. In recent years, several attempts have therefore been made to understand and organize multimedia content using speech, and substantial efforts and encouraging results have been reported on spoken document transcription, retrieval, and summarization. Spoken document summarization can be either extractive or abstractive. Extractive summarization selects indicative sentences, passages, or paragraphs from the original document according to a target summarization ratio and sequences them to form a summary. Abstractive summarization, on the other hand, produces a concise abstract of a certain length that reflects the key concepts of the document. The latter is more difficult to achieve, so recent research has focused on the former. In this thesis, we consider extractive summarization of Chinese broadcast news speech. We propose a unified probabilistic generative framework that seamlessly combines the sentence generative probability and the sentence prior probability for sentence ranking. Each sentence of the spoken document to be summarized is treated as a probabilistic generative model for predicting the document. Within this framework, two alternative approaches are investigated: the hidden Markov model (HMM) integrated with the relevance model (RM), and the word topical mixture model (wTMM). In addition, a confidence measure and a set of prosodic features are exploited for modeling the sentence prior probability. The summarization capabilities of the proposed approaches are verified by comparison with other conventional summarization methods. The experiments were performed on Chinese broadcast news collected in Taiwan, and the initial results are promising.
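The ranking idea in the abstract, scoring each sentence S by its generative probability P(D|S) for the document D together with a prior P(S), can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: a plain unigram sentence model interpolated with a background model stands in for the HMM/RM and wTMM models, the interpolation weight `lam` is arbitrary, and `rank_sentences` is a hypothetical name rather than anything from the thesis.

```python
import math
from collections import Counter


def rank_sentences(sentences, background, priors=None, lam=0.7):
    """Return sentence indices sorted by log P(D|S) + log P(S), best first.

    sentences  : list of token lists; together they form the document D
    background : list of tokens used for a smoothing background model
    priors     : optional list of positive sentence priors P(S);
                 a uniform prior is assumed when omitted
    lam        : assumed interpolation weight of the sentence model
    """
    doc_counts = Counter(w for sent in sentences for w in sent)
    bg_counts = Counter(background)
    bg_total = sum(bg_counts.values())
    vocab_size = len(set(doc_counts) | set(bg_counts))

    scores = []
    for i, sent in enumerate(sentences):
        sent_counts = Counter(sent)
        log_like = 0.0
        for word, count in doc_counts.items():
            # Maximum-likelihood unigram estimate from the sentence itself.
            p_sent = sent_counts[word] / len(sent) if sent else 0.0
            # Add-one smoothed background estimate, so the mixture is never zero.
            p_bg = (bg_counts[word] + 1) / (bg_total + vocab_size)
            log_like += count * math.log(lam * p_sent + (1 - lam) * p_bg)
        prior = priors[i] if priors is not None else 1.0 / len(sentences)
        scores.append(log_like + math.log(prior))

    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    doc = [["market", "stock", "market"], ["weather"]]
    bg = ["the", "market", "weather", "stock"]
    print(rank_sentences(doc, bg))                       # uniform prior
    print(rank_sentences(doc, bg, priors=[0.01, 0.99]))  # prior favours sentence 1
```

An extractive summary would then take sentences from the top of this ranking until the target summarization ratio is reached. In the thesis the prior is estimated from recognition confidence and prosodic features; here it is simply passed in as a list of numbers.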

Parallel Keywords

spoken documents, extractive summarization, hidden Markov model, relevance model, word topical mixture model


