A Case Study of an Ontology-Based Information System in a Hadoop Environment

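With the rapid growth of the World Wide Web, the volume of information and data on the network increases daily. Users often have to spend a great deal of time inspecting and filtering this mass of information, so using the web becomes less and less efficient. How to make web information more precise and more efficient to use in an era of information explosion is therefore a question worth investigating. The root cause is that existing software cannot automatically interpret the semantics of the huge amount of data stored on the web. Semantic Web technology can ease this limitation by using ontologies to describe the semantics contained in data, which improves the usability and effectiveness of that data. The drawback of ontology technology is that the software must perform a large amount of semantic interpretation and inference, which lowers system performance. This thesis uses MapReduce parallel computing on the Hadoop platform to mitigate this limitation. For the ontology component, it uses the Jena inference engine, which derives new inference results from the information users provide and can thus give users more information that matches their needs. Combining Jena with Hadoop's strong performance on massive data sets is expected to yield better results. In the case study, drawing on ideas from online communities, this thesis proposes a sharing-based platform that provides users with information about tourist attractions, with the aim of giving users information that is more relevant and that they actually need. Experimental tests show that the proposed method improves the execution efficiency of semantic interpretation and inference.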

Keywords

Hadoop; Jena; Semantic Web; MapReduce; Ontology

Parallel Abstract

With the rapid development of the internet, the quantity of information and data on the internet is growing substantially. As a consequence, users are often required to spend a considerable amount of time viewing and filtering information, which has reduced the efficiency of the internet significantly. How to increase the accuracy and efficiency of information on the internet is therefore an issue worthy of investigation. The cause of the problem lies in the large amount of information stored on the internet, which existing software programs cannot interpret automatically. Semantic Web technologies were created to overcome this limitation, using ontology techniques to describe the meaning of information content and increase the usability and validity of the information. However, the software must execute a large number of semantic interpretations and inferences to do so, which impairs the performance of the system. In this study, we used the parallel computation techniques of MapReduce on the Hadoop platform to overcome this adverse effect of ontology techniques. With regard to ontology, we used the Jena inference engine, which can infer new results based on information provided by users. This provides users with more information that meets their needs. We also made use of the superior performance of Hadoop in processing big data, with the aim of obtaining better results by integrating the two. Based on the concept of sharing in the internet community, we proposed a platform that provides users with information on tourist attractions. The objective was to provide users with accurate information that they truly need. The results of our experiment indicate that the proposed method increases the execution efficiency of semantic interpretation and inference.

Parallel Keywords

Hadoop; Jena; Semantic Web; MapReduce; Ontology

