Robust Target Speaker Tracking in Broadcast TV Streams

This paper addresses audio change detection and speaker tracking in broadcast TV streams. A two-pass audio change detection algorithm is proposed: the first pass detects potential change boundaries and the second pass refines them. Speaker tracking is then performed on the results of speaker change detection. To make the tracking results more reliable, Wiener filtering, endpoint detection of pitch, and segmental cepstral feature normalization are applied. The algorithm has low computational complexity, and our experiments show that it achieves satisfactory results.

Keywords

Speaker Tracking; Audio Segmentation; Entropy; GMM
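
The abstract names segmental cepstral feature normalization but gives no details, so the following is only a minimal sketch of one common form of the technique, not the authors' implementation. It assumes that both mean and variance are normalized per coefficient over a sliding window of frames; the function name `segmental_cmvn`, the default window length of 301 frames, and the `eps` guard are illustrative assumptions that do not come from the paper.

```python
from statistics import fmean, pstdev


def segmental_cmvn(frames, window=301, eps=1e-8):
    """Sliding-window cepstral mean and variance normalization.

    frames: list of equal-length lists of floats (one cepstral vector per frame).
    window: number of frames in the normalization segment (assumed value).
    eps:    small constant that avoids division by zero on constant segments.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    n = len(frames)
    half = window // 2
    normalized = []
    for t in range(n):
        # Segment centred on frame t, clipped at the stream boundaries.
        lo = max(0, t - half)
        hi = min(n, t + half + 1)
        segment = frames[lo:hi]
        row = []
        for d in range(len(frames[t])):
            values = [frame[d] for frame in segment]
            mean = fmean(values)
            std = pstdev(values)  # population standard deviation of the segment
            row.append((frames[t][d] - mean) / (std + eps))
        normalized.append(row)
    return normalized


if __name__ == "__main__":
    # Toy two-coefficient "cepstra"; real input would be MFCC-style vectors.
    toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    for row in segmental_cmvn(toy, window=3):
        print(row)
```

Normalizing over a local segment rather than the whole stream lets the statistics follow changes in channel and background conditions, which is a typical motivation for applying it to broadcast audio.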
