HTML (Hyper-Text Markup Language)是網際網路上網頁呈現的語言,在HTML中,一個標記(tag)所包覆的範圍內可以有其他的HTML標記,構成巢狀的標記結構。網頁製作者使用所見即所得的網頁編輯工具即可建立網頁,但是,這些工具所產生的HTML檔案,在經過反覆地對HTML本文做格式化設定後,往往會在巢狀結構中不同深度的位置,產生相同的格式化設定,無形中會增加HTML的標記,增加檔案的大小,降低傳輸效率。本論文提出一個動態規劃(Dynamic Programming)演算法做巢狀格式的最佳化,使最佳化後的網頁與未最佳化網頁的HTML語意是相同的,但卻能減少HTML標記,達到縮小檔案大小的目的。
HTML (HyperText Markup Language) is the language used to present web pages on the Internet. An HTML tag can enclose other HTML tags, forming a nested HTML formatting structure. Web authors can easily build pages with a WYSIWYG (What You See Is What You Get) HTML editor. However, when the formatting of a page is edited repeatedly, the files these tools generate often contain the same formatting settings at different depths of the nested structure. These redundant settings increase the number of HTML tags, which enlarges the file and lowers transfer efficiency. In this thesis, we propose a dynamic programming algorithm that optimizes nested HTML formatting structures. The optimized page has the same HTML semantics as the original but uses fewer tags, and therefore has a smaller file size.
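
The kind of redundancy described above can be illustrated with a small sketch. It is not the dynamic programming algorithm proposed in this thesis; it is only a simple top-down pass that drops a formatting tag when an enclosing tag already applies the same formatting. The `Node` class and the functions `render` and `strip_redundant` are hypothetical names introduced for this illustration, and the sketch assumes attribute-free tags whose effect does not change when nested inside themselves (such as `<i>`).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A formatting element: a tag name plus child nodes or text (hypothetical model)."""
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)


def render(item: Union[Node, str]) -> str:
    """Serialize a node (or plain text) back to an HTML string."""
    if isinstance(item, str):
        return item
    inner = "".join(render(child) for child in item.children)
    return f"<{item.tag}>{inner}</{item.tag}>"


def strip_redundant(item: Union[Node, str], active: frozenset = frozenset()) -> list:
    """Return the items that replace `item` once tags already in `active` are dropped.

    Assumption: a tag repeated inside itself adds nothing, so its children
    can be lifted into the enclosing element.
    """
    if isinstance(item, str):
        return [item]
    if item.tag in active:
        # The same formatting is already applied by an ancestor: keep only the children.
        lifted = []
        for child in item.children:
            lifted.extend(strip_redundant(child, active))
        return lifted
    inner_active = active | {item.tag}
    kids = []
    for child in item.children:
        kids.extend(strip_redundant(child, inner_active))
    return [Node(item.tag, kids)]


# An <i> element that reappears two levels down, as repeated WYSIWYG edits might leave it.
doc = Node("i", ["Hello ", Node("b", [Node("i", ["nested"]), " world"])])
before = render(doc)
after = "".join(render(node) for node in strip_redundant(doc))
print(before)
print(after)
```

The output keeps the same italic and bold ranges while using one fewer pair of tags. Choosing which tags to keep, move, or merge so that the total tag count is minimal is the harder problem that the proposed algorithm addresses.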