  • Thesis

Hardware Accelerated XML Parser for Virtual Token Descriptor

Advisor: 王勝德

Abstract


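As of 2008, Extensible Markup Language (XML) is used worldwide as the most common document exchange standard between different computer software, owing to its extensibility. The computer industry predicts that XML will be adopted even more widely over the coming decades. However, because XML is highly readable and clear to the human eye, parsing it costs a computer a large amount of time and memory.

Virtual Token Descriptor (VTD) is a new method of processing XML. Its most distinctive feature is that it is non-extractive: it does not pull data out of the file to build its own data structures. VTD records only part of the document's meta-information, such as the offsets of certain tags in the file. It offers processing capability nearly comparable to the Document Object Model (DOM) while using fewer computing resources.

This thesis presents a hardware implementation of an XML parser whose target output is VTD, and analyzes and discusses the techniques used to design this high-speed parser. The parser is specially designed around hardware constraints: it cannot detect grammatical errors in the document itself, but it can parse at very high speed. Hardware synthesis data and experimental results show that, on average, the hardware parser can parse XML documents at three billion bits per second.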

Parallel Abstract


As of 2008, XML (Extensible Markup Language) is used globally as the most common standard exchange format between different software systems because of its extensibility. The computer industry predicts that XML will be used increasingly in the coming decades. However, because the XML format is designed to be clear and understandable to humans, parsing XML costs a lot of time and memory.

Virtual Token Descriptor (VTD) is a new method of processing XML. Its most distinctive characteristic is that it is non-extractive: it does not extract data from the original XML file to build its own data structure. VTD records only certain important meta-information, such as the offsets of some tags. It can achieve almost all the functionality of DOM, but with fewer resources.

This thesis presents a hardware implementation of a VTD XML parser and discusses the mechanisms used to build a high-speed VTD XML parser. The parser is specially designed around hardware limitations: it cannot detect errors inside the XML document, but it is capable of high-speed parsing. Synthesis data and experimental results show that the parser can process XML documents at 3 Gb/s in the average case.
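The non-extractive idea described in the abstract can be illustrated with a small software sketch. This is not the thesis's hardware design or the actual VTD record layout: the `Token` tuple, its field names, and the `index_tokens` function are hypothetical names chosen for illustration. The sketch only shows the principle that the index stores offsets and lengths into the original bytes instead of copying data out, and that, like the parser described above, it makes no attempt to report malformed input.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """Hypothetical index record; field layout is an assumption, not the VTD format."""
    kind: str    # "start_tag" (element name) or "text"
    offset: int  # byte offset of the token in the original document
    length: int  # token length in bytes
    depth: int   # number of elements enclosing the token


def index_tokens(doc: bytes) -> List[Token]:
    """Record where tokens are without copying them out of `doc`.

    Simplified on purpose: attributes are not indexed, and comments or
    CDATA sections containing '>' are not handled. No errors are reported.
    """
    tokens: List[Token] = []
    depth = 0
    i = 0
    n = len(doc)
    while i < n:
        if doc[i:i + 1] == b"<":
            close = doc.find(b">", i)
            if close == -1:
                break  # unterminated markup: stop silently, no validation
            marker = doc[i + 1:i + 2]
            if marker == b"/":
                depth -= 1  # end tag closes the current element
            elif marker in (b"?", b"!"):
                pass  # declarations and comments are skipped in this sketch
            else:
                # The element name ends at whitespace, '/' or the closing '>'.
                j = i + 1
                while j < close and doc[j:j + 1] not in (b" ", b"\t", b"\n", b"\r", b"/"):
                    j += 1
                tokens.append(Token("start_tag", i + 1, j - (i + 1), depth))
                if doc[close - 1:close] != b"/":
                    depth += 1  # not self-closing, so an element is now open
            i = close + 1
        else:
            nxt = doc.find(b"<", i)
            if nxt == -1:
                nxt = n
            if doc[i:nxt].strip():  # ignore whitespace-only runs
                tokens.append(Token("text", i, nxt - i, depth))
            i = nxt
    return tokens


if __name__ == "__main__":
    document = b"<a><b>hi</b><c/></a>"
    for tok in index_tokens(document):
        # Content is recovered on demand by slicing the original buffer.
        print(tok, document[tok.offset:tok.offset + tok.length])
```

Because each record holds only integers, the index stays small and the document bytes are never duplicated; a consumer slices the original buffer when it actually needs a name or a text value.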

Keywords

Parser XML VTD hardware indexer UTF-8

