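In-memory computing offers new opportunities to resolve the critical performance problems that arise when processing massive data, especially now that various kinds of non-volatile memory are becoming popular storage media. Motivated by this observation, this dissertation investigates how to process data efficiently on flash memory while also taking flash-memory endurance into account. In the first part of the dissertation, we propose index designs suited to flash memory that address the reliability and performance problems of data processing on flash memory. We examine how hot-data access, sibling-link updates, and different workloads affect tree index structures on flash memory. A series of experiments shows that the proposed methods and index designs significantly improve both performance and endurance. In the second part, because flash memory has become a promising storage medium for massive-data applications, we further investigate how to perform efficient dynamic shortest path search over massive data in flash memory. In particular, we propose a locality-based data placement policy that uses spatial and temporal locality to design a search method optimized for flash memory. We also propose a wear-leveling design that includes a data reorganization mechanism, to reduce performance overhead and strengthen flash-memory endurance. In experiments with different realistic and synthetic workloads, we observe that the proposed design delivers excellent query performance and system endurance. In the last part, we investigate efficient system recovery mechanisms, particularly for database systems on flash memory. We are especially interested in ARIES-based recovery designs under the no-force and steal policies; ARIES recovery has been widely adopted by many database systems. We propose logging and recovery designs optimized for flash memory that efficiently identify the database transactions that need to be redone or undone, and that also address the effects of the no-force and steal policies. We also propose a static wear-leveling design for databases to effectively prolong the lifetime of flash memory. We validate the performance of the proposed methods through a series of experiments with realistic and synthetic workloads, and observe that the proposed methods perform well in both average response time and system reliability.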
In-memory computing offers opportunities to resolve critical performance issues in data manipulation in the big data era, especially as various kinds of non-volatile memory emerge as popular storage media. This observation motivates the dissertation to explore efficient, endurance-aware data manipulation over flash memory. In the first part of the dissertation, we propose flash-friendly index designs to resolve reliability and performance concerns for data manipulation over flash memory. We explore the impact of hot-data access, sibling-link updates, and different workload types on a tree index structure over flash memory. The proposed methodology and index designs were evaluated through a series of experiments, which showed significant improvements in both endurance and performance. In the second part of the dissertation, we explore efficient dynamic shortest path (DSP) search in large graphs over MLC flash memory, since flash memory has emerged as a potential data storage medium for big data applications. In particular, we propose a locality-based data placement policy that exploits both spatial and temporal locality for flash-friendly DSP search. A wear-leveling-aware design that includes data reorganization is presented to reduce performance overhead and to enhance the endurance of flash memory. Experiments with different realistic and synthetic workloads showed excellent query performance and improved system endurance. The dissertation concludes by exploring efficient system recovery designs, especially for databases over flash memory. We are interested in recovery designs with steal and no-force policies over ARIES-based recovery, which is widely adopted in many database systems. We propose logging and recovery designs that efficiently identify the transactions that need to be redone or undone, in order to address the steal and no-force policies.
A static wear-leveling strategy for databases is also introduced to efficiently prolong the lifetime of flash memory. A series of experiments over realistic and synthetic workloads was conducted to evaluate the proposed design, and the results show excellent average response time and a good level of reliability.