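In recent years, with the arrival of the digital era, many traditional newspapers, magazines, paper documents, and similar materials have gradually been digitized for presentation and preservation. However, digital image formats are numerous and varied, and as software and hardware are retired and replaced, digital data stored in older image formats can become inaccessible; migration techniques have therefore been developed to solve this problem. Our previously proposed intelligent image classification system analyzes and classifies image contents in order to assign the optimal format for migration. In this thesis, we propose a form document extraction system based on Markov random fields, which analyzes the contents of form documents and removes their form features, and we integrate it into our previously proposed intelligent image classification system to improve its classification performance. The implementation results demonstrate that the proposed form document extraction system not only removes form features effectively but also, when combined with the intelligent image classification system, improves that system's classification accuracy over digital images as a whole.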
With the arrival of the digital era, large numbers of traditional newspapers, magazines, and paper documents are being digitized for archiving. However, digital image formats are numerous, and data stored in older formats can become unreadable when software or hardware is updated. Many works have therefore adopted format migration to solve this problem. We previously proposed an intelligent image classification system that decides the best format for migration by analyzing and classifying image contents. In this thesis, we propose a form document extraction system based on Markov random fields, which analyzes the contents of form documents and removes their form features. We integrate this form document extraction system into our intelligent image classification system to improve its classification performance. Experimental results show that the proposed form document extraction system is effective at extracting form features and that combining it with the intelligent image classification system improves overall image classification accuracy.