Improving Spoken Information Retrieval with Acoustic Feature Similarity

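Spoken information retrieval typically first runs speech recognition to convert the audio in the corpus into text, and then searches the recognized text. This architecture, however, depends heavily on a good speech recognizer. If recognition accuracy is low, the retrieval system cannot use the recognized text to determine which utterances contain the query, and retrieval performance drops sharply. This thesis proposes two methods that use acoustic feature similarity to improve spoken information retrieval: pseudo-relevance feedback and graph-based re-ranking. Their advantage is that they are unsupervised: by comparing acoustic features, they effectively compensate for the loss in retrieval performance caused by low recognition accuracy. For pseudo-relevance feedback, we define an acoustic feature similarity score and propose three methods for selecting pseudo-relevant utterances. For graph-based re-ranking, we build an utterance relation graph from acoustic feature similarity and apply the random walk and modified random walk algorithms to redistribute the relevance scores. We also combine the two methods to obtain the best retrieval performance. With a recognizer whose recognition accuracy is 62.55%, the mean average precision of spoken information retrieval improves from 55.54% to 70.61%, a relative improvement of 27%.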
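The graph-based re-ranking idea can be illustrated with a minimal sketch. It is a generic random walk with restart over an utterance similarity graph, not the thesis's exact formulation: the function name `random_walk_rerank`, the interpolation weight `alpha`, and the toy similarity values are all assumptions made for illustration. In practice the similarities would come from comparing acoustic features of the retrieved utterances.

```python
def random_walk_rerank(initial_scores, similarity, alpha=0.5,
                       iterations=100, tol=1e-10):
    """Re-rank utterances by propagating relevance over a similarity graph.

    initial_scores: first-pass relevance scores, one per utterance (hypothetical input).
    similarity:     symmetric matrix of non-negative acoustic similarities (hypothetical input).
    alpha:          weight given to scores propagated from neighbours (assumed parameter).
    """
    n = len(initial_scores)
    total = sum(initial_scores)
    if total <= 0:
        raise ValueError("initial scores must have a positive sum")
    # Normalise the first-pass scores so they act as a restart distribution.
    prior = [s / total for s in initial_scores]
    # Total outgoing edge weight of each node, ignoring self-loops.
    out_sums = [sum(similarity[j][i] for i in range(n) if i != j)
                for j in range(n)]

    scores = prior[:]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Score flowing into node i from every other node j,
            # in proportion to the normalised edge weight j -> i.
            propagated = sum(
                scores[j] * similarity[j][i] / out_sums[j]
                for j in range(n)
                if j != i and out_sums[j] > 0
            )
            new_scores.append((1 - alpha) * prior[i] + alpha * propagated)
        change = max(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if change < tol:
            break
    return scores


# Toy example: utterances 0 and 1 sound alike; utterance 2 resembles neither.
first_pass = [0.6, 0.3, 0.35]
acoustic_similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
reranked = random_walk_rerank(first_pass, acoustic_similarity, alpha=0.5)
ranking = sorted(range(len(reranked)), key=lambda i: reranked[i], reverse=True)
print(ranking)
```

In this toy case utterance 1 starts below utterance 2 but moves ahead of it after re-ranking, because it is acoustically close to the top-scoring utterance 0. This is the sense in which acoustic similarity can make up for a weak first-pass score.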

Keywords

Speech Information Retrieval; Spoken Term Detection; Pseudo-Relevance Feedback
