Decision makers often examine aggregate data in a data warehouse through a multidimensional structure known as a data cube. In a relational database, a data cube can be regarded as a set of views. A common way to improve aggregate query performance on a data warehouse is to materialize some of the views in the data cube. Once a view is materialized, however, the system manager must account for the cost of building and maintaining it. Because storage space is limited, a central issue in data warehouse design is how to select a set of views to materialize that reduces total query response time while keeping view maintenance cost low. In this thesis, we survey previous work on the selection of materialized views and design a backward greedy algorithm that solves the view selection problem for data cubes under a storage space constraint. Unlike previous algorithms, ours evaluates each materialized view by its damage to overall performance. Besides offering system managers a different selection strategy to suit their needs, the backward greedy algorithm yields a better selection than previous algorithms for some data cubes. Finally, we propose combining our algorithm with previous algorithms to make view selection for data cubes more complete and thorough.
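The backward greedy idea described above can be illustrated with a minimal sketch. It is not the thesis's actual algorithm, whose details the abstract does not give. It assumes that a query on a view costs as many rows as the smallest materialized view containing all of its dimensions, that every view is queried equally often, that "damage" means the increase in total query cost per row of storage freed, and that the top view is always kept. The names `answer_cost`, `total_cost`, `backward_greedy`, and the sample sizes are all hypothetical.

```python
def answer_cost(query, materialized, size):
    """Rows scanned to answer `query`: the smallest materialized view
    whose dimensions include every dimension of the query (assumed cost model)."""
    return min(size[v] for v in materialized if query <= v)


def total_cost(materialized, size):
    """Total cost of querying every view in the cube once (uniform workload assumed)."""
    return sum(answer_cost(q, materialized, size) for q in size)


def backward_greedy(size, budget):
    """Start with every view materialized, then repeatedly drop the view whose
    removal does the least damage per row freed, until the budget is met."""
    top = max(size, key=len)  # the view over all dimensions; never dropped
    if size[top] > budget:
        raise ValueError("budget cannot hold even the top view")

    selected = set(size)
    while sum(size[v] for v in selected) > budget:
        base = total_cost(selected, size)

        def damage_per_row(v):
            # Extra total query cost caused by dropping v, per row of space freed.
            return (total_cost(selected - {v}, size) - base) / size[v]

        # Sort the candidates so that ties are broken deterministically.
        candidates = sorted(
            (v for v in selected if v != top),
            key=lambda v: (len(v), sorted(v)),
        )
        selected.remove(min(candidates, key=damage_per_row))
    return selected


# Hypothetical two-dimensional cube: view -> number of rows.
sizes = {
    frozenset("ab"): 100,
    frozenset("a"): 50,
    frozenset("b"): 20,
    frozenset(): 1,
}

chosen = backward_greedy(sizes, budget=125)
```

With these made-up sizes and a budget of 125 rows, dropping the view on `a` costs 50 extra rows of scanning for 50 rows freed, the lowest damage per row among the candidates, so it is removed first and the remaining three views fit within the budget. Dividing the damage by the space freed is one possible design choice; ranking by raw damage alone would give a different strategy.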