Dimension Reduction and Visualization of the Histogram Data Using Sliced Inverse Regression

Advisor: 吳漢銘

Abstract


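In symbolic data analysis (SDA), histogram data are an important research topic, and most work to date has relied on principal component analysis (PCA). In this study, we use an alternative dimension reduction method, sliced inverse regression (SIR), to reduce the dimension of histogram data. SIR is a slice-based sufficient dimension reduction technique that lets us observe, in a low-dimensional space, the structure and information hidden in high-dimensional data. We first use the empirical distribution of the histogram variables to compute the symbolic weighted covariance matrix, and then use the linear combination of histograms method together with matrix visualization to visualize the dimension-reduced histogram data. We use several real data sets to evaluate the discriminative ability of the reduced data and the visualization method.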

Parallel Abstract


Dimension reduction of histogram-valued data (histogram data hereafter) is an active research topic in symbolic data analysis (SDA). Most work so far has focused on extensions of principal component analysis (PCA). In this study, we extend classical sliced inverse regression (SIR), an alternative dimension reduction method, to histogram data. SIR is a popular slice-based sufficient dimension reduction technique for exploring the intrinsic structure of high-dimensional data. We first use the empirical (joint) density of the histogram variables to compute the symbolic weighted variance-covariance matrix. A linear-combination-of-histograms rule and the matrix visualization technique are then used to visualize the projections of the histograms in the low-dimensional subspace. We evaluate the method for low-dimensional discrimination and visualization on several real data sets. A comparison with PCA for histogram data is also reported.
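
For orientation, the classical SIR procedure that the study extends can be sketched as follows. This is a minimal illustration for ordinary numeric data only; it does not implement the symbolic weighted covariance matrix or the histogram projection described above. The function name `sir_directions`, its parameters, and the simulated data are assumptions made for this example.

```python
import numpy as np

def sir_directions(X, y, n_slices=5, n_directions=2):
    """Classical SIR for ordinary numeric data (hypothetical helper, not the thesis method)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape

    # Standardize: Z = (X - mean) @ cov^{-1/2}, so that cov(Z) is the identity.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the observations by the sorted response.
    order = np.argsort(y, kind="stable")
    slices = np.array_split(order, n_slices)

    # Weighted covariance matrix of the slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original scale of X.
    w, v = np.linalg.eigh(M)
    top = np.argsort(w)[::-1][:n_directions]
    return inv_sqrt @ v[:, top], w[top]

# Simulated example (assumed data): y depends on X only through x1 + x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
y = X @ beta + 0.1 * rng.normal(size=500)

directions, eigenvalues = sir_directions(X, y)
projected = (X - X.mean(axis=0)) @ directions  # low-dimensional coordinates
```

In this simulated setting the first estimated direction should be close to a multiple of `beta`, since the response depends on the predictors only through that one linear combination.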

