
基於Ceph分散式儲存上資料可靠性之效能評測

Performance evaluation of the reliability based on Ceph distributed storage

Advisors: 盧永豐, 黃慧鳳

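Abstract


In an era of exploding data volumes, datasets often exceed the disk capacity of a single computer, and data frequently has to be retained for longer than the lifetime of the machine that holds it. Users may access data from different locations or wish to share it with users elsewhere. At the same time, cloud technology has become widespread and virtualization is heavily used in enterprise data centers. Traditional approaches such as upgrading storage devices can no longer cope with these changes in the information industry. To meet this need, the Ceph distributed file storage system offers a large amount of storage space and makes backup copies of files as they are stored, reducing the data loss caused when storage is damaged; because multiple copies of each file exist, files can also be shared with several users at once. The development of software-defined infrastructure has brought enterprises new solutions, so building on-demand, software-defined systems is receiving increasing attention.

Object storage differs from conventional storage systems: data is stored as objects rather than as fixed-size blocks. Besides its data, each object carries metadata that describes the object's attributes. Storing an object no longer means simply placing blocks in particular disk sectors; instead, the data and metadata are combined into an object, which is then distributed across physical nodes.

This study uses the object storage service provided by Ceph together with erasure coding [9], applying clustering to address insufficient hardware performance and limited scalability. Read and write performance is measured with data of different sizes, and traditional hard disk drives (HDDs) are compared with solid-state drives (SSDs). The measurements are summarized and corresponding rules are derived in order to improve the performance of Ceph's object storage service, aiming for the highest performance at the lowest cost.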

Parallel Abstract


With the amount of information growing rapidly, data is often larger than the hard disk space of a single computer, and it usually has to be retained for longer than the computer's lifetime. Users may access data from different places or want to share it with users in other regions. Meanwhile, cloud technology has become ubiquitous and virtualization is widely used in enterprise data centers. Traditional measures such as upgrading storage devices can no longer keep up with these changes. To meet this demand, the Ceph distributed file storage system provides a large amount of storage space and creates backup copies as files are stored, which reduces data loss when storage is damaged; because each file has multiple copies, it can also be shared with multiple users simultaneously. The development of software-defined infrastructure offers enterprises a new solution, so building on-demand software-defined systems is gradually gaining attention. Object storage differs from conventional storage systems in that data is stored as objects rather than as fixed blocks. In addition to its data, each object has metadata describing its properties; rather than merely writing blocks to disk sectors, the system joins the data with its metadata to form an object, which is then distributed to physical nodes for storage.
In this study, we use the object storage service provided by Ceph together with erasure coding, and apply clustering to address insufficient hardware performance and scalability. We measure read and write performance for different data sizes, compare traditional hard disk drives (HDDs) with solid-state drives (SSDs), summarize the measured data and identify the corresponding rules, and thereby improve the performance of Ceph's object storage service, seeking the highest performance at the lowest cost.
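
The combination of object storage and erasure coding described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation and not Ceph's API: it assumes the simplest possible erasure code, k data chunks plus one XOR parity chunk, whereas Ceph's erasure-coded pools use more general codes that tolerate more failures. The helper names (`encode`, `recover`, `decode`) and the sample object are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-sized chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def recover(chunks: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing chunk (marked None) from the survivors."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("a single parity chunk can repair only one loss")
    repaired = list(chunks)
    if missing:
        survivors = [c for c in chunks if c is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(chunks: List[Optional[bytes]], size: int) -> bytes:
    """Reassemble the original payload, dropping the parity chunk and padding."""
    return b"".join(recover(chunks)[:-1])[:size]


# A hypothetical object: payload plus metadata describing it.
payload = b"Ceph object storage with erasure coding"
obj = {"metadata": {"name": "example.txt", "size": len(payload)}, "data": payload}

k = 4
stored = encode(obj["data"], k)  # each chunk would live on a different node
stored[2] = None                 # simulate the loss of one node
restored = decode(stored, obj["metadata"]["size"])

overhead_erasure = (k + 1) / k   # raw bytes stored per byte of user data
overhead_replica = 3.0           # three full copies, for comparison
```

With k = 4 and one parity chunk, the raw storage used is 1.25 times the payload, compared with 3 times for three full copies, while still surviving the loss of any single chunk. This trade-off between storage cost and recovery work is what makes erasure coding relevant to a cost-versus-performance evaluation.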

References


[16] Sushil Bhardwaj, Leena Jain, Sandeep Jain, "Cloud Computing: A Study of Infrastructure as a Service (IaaS)", International Journal of Engineering and Information Technology (IJEIT), 2(1), 2010, pp. 60-63.
[24] Richard Jones, Rafael D. Lins, "Garbage Collection: Algorithms for Automatic Dynamic Memory Management", it – Information Technology, 53, 2011, pp. 163-164.
[6] Rajkumar Buyya, Rodrigo N. Calheiros, Jungmin Son, Amir Vahid Dastjerdi, Young Yoon, "Software-Defined Cloud Computing: Architectural Elements and Open Challenges", 19 Feb. 2015 (v2).
[9] W. K. Lin, D. M. Chiu, Y. B. Lee, "Erasure Code Replication Revisited", in Proceedings of the Fourth International Conference on Peer-to-Peer Computing, 25-27 Aug. 2004, pp. 90-97.
