
複雜背景教學視訊串列之文字擷取

Text Extraction for Lecture Videos with Complicated Background

Advisor: 陳淑媛

Abstract


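With the rapid growth and popularity of video media and the Internet, the era of e-learning has arrived sooner than expected. In this far-reaching e-learning environment, any user can easily access all kinds of learning material through video, anywhere and at any time. Providing a sound retrieval system that lets users search lecture videos conveniently and effectively for the material they need is therefore an issue that cannot be overlooked when building a complete e-learning environment. This thesis proposes a new method for extracting text from lecture videos with complicated backgrounds, in support of keyword-based retrieval of lecture videos.

Because the backgrounds of lecture slides are complicated and varied, and may even resemble text, one research topic of this thesis is a foreground segmentation method designed for lecture videos to extract foreground text regions. Lecture videos also tend to have low resolution, so improving text quality to support subsequent text recognition is the other major topic of this thesis.

First, scene analysis is performed on the lecture video to segment the scene images corresponding to each slide. These are merged into a single key image so that subsequent processing runs on the key image only, which reduces computation time. Next, block-wise image features are extracted from each key image and analyzed as time series according to their feature values to build a background image of the slide, from which the foreground layer is extracted. Finally, the extracted foreground layer undergoes text quality enhancement and binarization to support subsequent text recognition. Text recognition accuracy is the basis for evaluating this thesis, and experiments show that the proposed method is feasible and effective.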
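The first stage described above (splitting the video at slide changes and merging each segment into one key image) can be illustrated with a minimal sketch. The abstract does not state which difference measure, threshold, or merging rule the thesis uses, so the mean absolute difference, the `threshold` value, and the pixel-wise median below are assumptions chosen for illustration, and all function names are hypothetical.

```python
from statistics import median

# A grayscale frame as rows of 0-255 intensities (illustrative representation).
Frame = list[list[int]]


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute intensity difference between two same-sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def split_into_slides(frames: list[Frame], threshold: float = 30.0) -> list[list[Frame]]:
    """Start a new segment whenever consecutive frames differ by more than
    `threshold` (an assumed value, not taken from the thesis)."""
    segments: list[list[Frame]] = []
    for i, frame in enumerate(frames):
        if i == 0 or frame_difference(frames[i - 1], frame) > threshold:
            segments.append([])
        segments[-1].append(frame)
    return segments


def merge_key_frame(segment: list[Frame]) -> Frame:
    """Merge one segment into a key image using the pixel-wise median,
    which suppresses transient noise (assumed merging rule)."""
    rows, cols = len(segment[0]), len(segment[0][0])
    return [
        [int(median(f[r][c] for f in segment)) for c in range(cols)]
        for r in range(rows)
    ]
```

Running the later stages on one merged image per slide, rather than on every frame, is what yields the reduction in computation time mentioned in the abstract.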

Parallel Abstract


As streaming media and the Internet are used more and more widely, the era of e-learning has emerged. In an e-learning system, learners can access lecture videos anytime and anywhere. It is therefore imperative to provide an effective method for retrieving lecture videos conveniently. In this thesis, text extraction for lecture videos with complicated backgrounds is proposed to facilitate lecture video retrieval using textual keywords. Since the background of a lecture video may be complicated and ornate, and in particular may have text-like characteristics, a foreground segmentation method is designed to extract text regions. In addition, since the resolution of lecture videos is generally low, enhancing text quality to facilitate subsequent text recognition is the other issue addressed in this thesis. First, temporal analysis of the lecture video is performed to detect slide transitions. The frames between two consecutive slide transitions are then merged into a key frame that represents the slide. Subsequent processing is applied to the key frame only, which reduces computing time. Second, local features are extracted from a block partition of each key frame; from these features a background model is generated, followed by foreground extraction. Finally, for each text region extracted from the foreground, quality improvement and adaptive binarization are applied to facilitate subsequent optical character recognition. The recognition accuracy rate is used to evaluate the performance of the proposed method and to compare it with existing methods. Experiments demonstrate the effectiveness and feasibility of the method.
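To make the final step more concrete, here is a minimal sketch of one common form of adaptive binarization: each pixel is compared against the mean of its local neighborhood. The abstract does not specify the thresholding rule used in the thesis, so the local-mean rule, the `radius` and `offset` parameters, and the function name are assumptions for illustration only. The sketch also assumes dark text on a lighter background.

```python
def adaptive_binarize(image: list[list[int]], radius: int = 1, offset: float = 5.0) -> list[list[int]]:
    """Mark a pixel as text (1) when it is darker than the mean of its
    (2*radius+1)-square neighborhood by more than `offset`; otherwise 0.
    Windows are clipped at the image border."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out_row.append(1 if image[r][c] < local_mean - offset else 0)
        result.append(out_row)
    return result
```

Because the threshold follows the local mean, a dark stroke can still be separated from a background whose brightness varies across the slide, a case where a single global threshold tends to fail.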

