基於新式經濟評估模型的節能、可靠儲存機制暨相關工具設計用於資料密集典藏系統之研究

Based on a Novel Economic Evaluation Model to Design an Energy-efficient and Reliable Storage Mechanism with Associated Tools for Data-intensive Archive System

Advisor: Wei-Kuan Shih

Abstract


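Big data is an increasingly important topic. With the development of mobile devices and the spread of network technology, much data is generated by end devices and stored over the network in remote cloud storage. Besides storing the data that users upload, a typical cloud storage provider must also use a data fault-tolerance mechanism, such as a Redundant Array of Independent Disks (RAID), to generate redundant data that improves data safety and lowers the probability that data cannot be recovered after corruption. As a result, the storage space of a cloud storage service is usually built from a large number of low-cost, power-hungry hard disk drives. Previous studies found that electricity accounts for roughly 50% of a cloud storage provider's operating cost, and further analysis showed that the storage system accounts for 27% of the overall power consumed by cloud storage. Green data centers have therefore become a very important consideration in storage system design. Many energy-efficient storage approaches have been proposed to reduce the power consumption of storage systems, and they fall into two broad categories: (1) dynamically spinning down idle disks and waking them on access; and (2) building the storage system from high-speed and low-speed disks and placing data on disks of different rotational speeds according to the data's characteristics. The first approach switches disks on and off frequently, which shortens disk lifetime and raises hardware replacement cost, yet previously proposed solutions did not address this problem. The second approach offers better hardware reliability because it does not save power by switching disks on and off, but it saves less power.

Combining these two design perspectives, this study proposes a new evaluation method (E3SaRC) that considers both the power-saving benefit of an energy-saving mechanism on a storage system and its impact on hardware cost. Under this method, every energy-efficient storage design is reconsidered from an economic point of view: the goal of designing an energy-efficient storage system is no longer power saving alone, but must also account for the hardware cost impact that the power saving causes. In addition to the new evaluation model, and in keeping with this motivation, this dissertation designs a storage mechanism (CacheRAID) that considers both energy saving and the cost impact that an energy-saving implementation imposes on the system. The design uses solid-state drives as the system's write buffer and places data in the buffering solid-state drives according to the correlation in users' access patterns, producing a large amount of sequential access. This lets the system reach its maximum access performance and gives idle disks more time to rest. To account for the cost of disk state transitions, we also integrate a disk state-transition control mechanism into the proposed architecture to reduce the hardware cost impact. Finally, to improve the performance of the proposed system, this dissertation also presents a simulation tool that simulates the power-consumption behavior of a storage system and modularizes the energy-saving functionality, so users can implement their own energy-saving methods in a module and quickly evaluate their effect on a storage system. The simulation tool additionally finds the best settings for our proposed storage mechanism, improving the performance and reliability of the designed system.

At the end of the dissertation, we evaluate the proposed system with experiments on real machines. The experiments run two sets of real storage system workloads: the digital archive system of Academia Sinica and the file system of Florida International University. The results show that our proposed energy-efficient storage mechanism is the only method that reduces storage system cost under the new economic evaluation model; the other compared methods do not account for the hardware cost impact and therefore perform worse. The results also show that our method saves more than 65% of the storage system's power consumption. Furthermore, the power consumption that the simulation tool estimates for our method is close to that measured on real machines, with an error of only about 2.5%. Based on these results, this dissertation presents a complete energy-saving system that balances energy saving and system reliability and sufficiently reduces the hardware cost impact of energy saving.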
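The abstract does not state the E3SaRC formula, so the following is only a minimal sketch of the kind of trade-off it describes, under an assumed simplification: yearly net saving equals the electricity cost saved minus the expected cost of the extra disk replacements caused by a higher failure rate. The class `SchemeProfile`, the function `net_yearly_saving`, and every field and number are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class SchemeProfile:
    """Hypothetical yearly profile of one energy-saving scheme (illustrative only)."""
    energy_saved_kwh: float  # energy saved per year versus an always-on baseline
    baseline_afr: float      # annual failure rate of a disk without the scheme
    scheme_afr: float        # annual failure rate of a disk under the scheme
    num_disks: int           # number of disks in the array
    disk_price: float        # replacement cost of one disk


def net_yearly_saving(profile: SchemeProfile, price_per_kwh: float) -> float:
    """Assumed simplification: electricity saved minus extra expected replacement cost."""
    electricity_saved = profile.energy_saved_kwh * price_per_kwh
    extra_failures = (profile.scheme_afr - profile.baseline_afr) * profile.num_disks
    return electricity_saved - extra_failures * profile.disk_price


# Made-up numbers: an aggressive spin-down scheme versus a gentler one.
aggressive = SchemeProfile(2000.0, 0.02, 0.10, 50, 100.0)
gentle = SchemeProfile(1500.0, 0.02, 0.03, 50, 100.0)
for name, scheme in (("aggressive", aggressive), ("gentle", gentle)):
    print(name, round(net_yearly_saving(scheme, 0.10), 2))
```

With these made-up inputs, the scheme that saves more energy comes out worse once replacement cost is counted, which is the effect the evaluation model is meant to expose.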

Parallel Abstract


Green data centers have recently garnered much attention because of the dramatic growth of data in every conceivable industry and application. With high network bandwidth available, mobile applications and user clients routinely back up program and user data to remote data centers. In addition to storing the data from users, a data center usually employs a data fault-tolerance mechanism that generates redundant data to keep user data from being lost or corrupted. Because data centers must preserve enormous amounts of data, storage systems account for about 27%-35% of the power consumption of a typical data center. To reduce the energy consumption of storage systems, previous studies switched idle disks to standby or sleep mode. According to research conducted by Google and the IDEMA standard, frequently switching a disk to standby mode increases its Annual Failure Rate and reduces its lifespan. In most cases, however, the authors of those studies did not analyze the reliability of their solutions. To address this issue, we propose an evaluation function called E3SaRC (Economic Evaluation of Energy saving with Reliability Constraint), which comprehensively evaluates the effect of an energy-efficient solution by considering the cost of hardware failure when energy-saving schemes are applied. With both system reliability and energy efficiency in mind, this study proposes an energy-efficient and reliable storage system composed of an energy-efficient storage scheme with a data fault-tolerance algorithm, an adaptive simulation tool, and a monitoring framework. First, because power consumption is the most important issue in this dissertation, we developed a data placement mechanism called CacheRAID, based on a Redundant Array of Independent Disks (RAID-5) architecture, to mitigate the random access problems that implicitly exist in RAID techniques and thereby reduce the energy consumption of RAID disks.
To address system reliability, CacheRAID applies a control mechanism to the spin-down algorithm. To further improve the energy efficiency of the proposed system, an adaptive simulation tool finds the best system parameters for CacheRAID by quickly simulating the current workload on the storage system. The contributions of this dissertation are presented in two parts. First, our experimental results show that the proposed storage system reduces the power consumption of a conventional software RAID 5 system by 65-80%. Moreover, according to the E3SaRC measurement, the overall cost saved by CacheRAID is the largest among the systems that we compared. Second, the analytical results demonstrate that the energy estimates of the proposed simulation tool deviate from real-world measurements by only about 2.5%. The proposed tool can therefore accurately simulate the power consumption of a storage system under different system settings. According to the experimental results, the proposed system significantly reduces storage system power consumption and increases system reliability.
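The abstract says that CacheRAID applies a control mechanism to the spin-down algorithm but does not describe how it works. As one possible illustration, and not the dissertation's actual design, the sketch below caps the number of spin-downs a disk may perform in any 24-hour window, so that power saving cannot drive the number of state switches arbitrarily high. The class name `SpinDownController` and both parameters are hypothetical.

```python
from collections import deque


class SpinDownController:
    """Hypothetical guard that limits how often a single disk may be spun down."""

    def __init__(self, idle_threshold_s: float, max_spin_downs_per_day: int):
        self.idle_threshold_s = idle_threshold_s
        self.max_spin_downs_per_day = max_spin_downs_per_day
        self._recent = deque()  # timestamps of spin-downs in the last 24 hours

    def should_spin_down(self, now_s: float, last_access_s: float) -> bool:
        # Forget spin-downs that happened more than 24 hours ago.
        while self._recent and self._recent[0] <= now_s - 86400.0:
            self._recent.popleft()
        idle_long_enough = (now_s - last_access_s) >= self.idle_threshold_s
        within_budget = len(self._recent) < self.max_spin_downs_per_day
        if idle_long_enough and within_budget:
            self._recent.append(now_s)  # record the spin-down we are allowing
            return True
        return False


controller = SpinDownController(idle_threshold_s=60.0, max_spin_downs_per_day=2)
print(controller.should_spin_down(now_s=100.0, last_access_s=0.0))
```

A fixed daily budget is the simplest possible policy. A controller that also weighs the expected energy saved against the wear cost of each switch would be closer to the economic view that E3SaRC takes.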
