In this thesis, we propose a subtitle generation assistant system for Chinese teaching videos based on sentence boundary detection. To obtain the subtitle segments, the system uses sentence boundary detection to separate the voiced parts of the audio from the silent parts, and then generates subtitles from the voiced parts. The thesis has four parts. In the first part, we obtain the speech features of the videos for subsequent analysis and subtitle generation: each video is converted to a .wav audio file and the sampling rate parameter is set. In the second part, the converted audio is split by sentence boundary detection, and the resulting segments are then combined so that each one meets a minimum length. In the third part, speech recognition converts each audio segment to text, the time codes are computed from the lengths of the audio segments, and a .srt file containing the subtitles and their time codes is generated. The last part presents the experimental results, obtained in a Python environment.
The contributions of this thesis are as follows:
1. Simple method: sentence boundary detection uses only the silent parts of the audio to locate the voiced parts.
2. Reduced subtitling workload: the system produces a .srt file with subtitles and time codes, so to use it on a real video the user only needs to correct the words that speech recognition got wrong.
3. Accuracy: the simulation results show that the time codes this method finds for the voiced parts are very accurate.
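The second and third parts of the pipeline described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the frame-energy silence test, the threshold values, and the rule of merging a too-short segment into the one that follows it are all assumptions made for illustration. The speech recognition step is left out, so the subtitle text is supplied by the caller.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec)


def voiced_segments(samples: Sequence[float], sample_rate: int,
                    frame_ms: int = 20, energy_threshold: float = 0.01,
                    min_silence_ms: int = 300) -> List[Segment]:
    """Find voiced spans, closing a span once silence lasts min_silence_ms.

    A frame counts as silent when its mean squared amplitude is below
    energy_threshold (an assumed criterion, not taken from the thesis).
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silent_frames = max(1, min_silence_ms // frame_ms)
    n_frames = len(samples) // frame_len
    segments: List[Segment] = []
    start = None      # frame index where the current voiced span began
    silent_run = 0    # consecutive silent frames inside the current span
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= energy_threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                end = i - silent_run + 1  # index of the first silent frame
                segments.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start, silent_run = None, 0
    if start is not None:  # audio ended while a span was still open
        end = n_frames - silent_run
        segments.append((start * frame_len / sample_rate,
                         end * frame_len / sample_rate))
    return segments


def merge_short(segments: Sequence[Segment], min_length: float) -> List[Segment]:
    """Extend any segment shorter than min_length over the segment after it.

    Assumed merge rule; a short final segment has no successor and is kept.
    """
    merged: List[Segment] = []
    for start, end in segments:
        if merged and merged[-1][1] - merged[-1][0] < min_length:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT time code, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: Sequence[Segment], texts: Sequence[str]) -> str:
    """Build .srt content: index, time code line, text, blank line."""
    blocks = []
    for idx, ((start, end), text) in enumerate(zip(segments, texts), start=1):
        blocks.append(
            f"{idx}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

In use, the samples would come from the .wav file produced in the first part, the output of `voiced_segments` would be passed through `merge_short`, and each merged segment would be sent to a speech recognizer whose text is then handed to `to_srt`. The energy threshold and the minimum silence duration would need tuning for real recordings.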