PageFile: The Return of Classical Page Storage Structure on MapReduce Framework

Abstract


Over the last decade, the MapReduce framework has been applied in many research efforts and has proved effective for processing large-scale data. However, it has mostly been used to manage unstructured and semi-structured data, and many MapReduce-based database systems have abandoned the classical page storage structure. As a result, current MapReduce systems pay little attention to massive structured data. In this paper, we propose PageFile, a hybrid page-based storage structure on the MapReduce framework. Compared with Hive's RCFile, it offers faster query processing and better disk space utilization. Moreover, we build a "multiple reduced B-Trees" structure on top of PageFile, which performs very well for queries on single column values or small ranges.
