
The Application of Sliced Inverse Regression for Dimension Reduction of Interval-Valued Symbolic Data

Abstract


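Sliced inverse regression (SIR) finds effective dimension reduction directions for exploring the intrinsic structure of high-dimensional data. For inverse regression problems with a single response variable, SIR has been developed for and applied to a variety of data types, such as survival data, time series data, functional data, and longitudinal data. In this study, we extend SIR to interval-valued symbolic data. The interval data are first transformed using the vertices method or the centers method, and SIR is then applied to the transformed data. Simulation results show that different slicing strategies yield different dimension reduction directions and different low-dimensional visualizations, so choosing a suitable slicing strategy helps in correctly analyzing the structure and information hidden in this type of high-dimensional data. We therefore further adopt cluster-based SIR to analyze interval-valued symbolic data and compare it with symbolic principal component analysis, evaluating their discriminative ability and visualization performance in the low-dimensional space.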

Parallel Abstract


Sliced inverse regression (SIR) was introduced by Li (1991) to find effective dimension reduction directions for exploring the intrinsic structure of high-dimensional data. For regression with a univariate response, SIR has been extended and applied to different data types, including survival data, time series data, functional data, and longitudinal data. This study develops SIR for interval-valued symbolic data. First, the interval-valued data are transformed into a conventional data matrix using the vertices method or the centers method. The classical SIR algorithm is then applied directly to the transformed data. The simulation results show that different slicing schemes produce different projection directions and different lower-dimensional visualizations. A suitable slicing scheme is therefore needed to correctly investigate, in the lower-dimensional plots, the structure and information embedded in high-dimensional interval-valued symbolic data. These results motivated us to adopt cluster-based SIR to improve the implementation of symbolic SIR. We compared and evaluated the results against those obtained with several existing symbolic dimension reduction techniques (such as symbolic principal component analysis) for discriminative and visualization purposes.
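
The two-step procedure described above (transform the intervals, then run classical SIR) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `centers`, `vertices`, and `sir_directions` are hypothetical, each interval variable is assumed to be stored as a pair of lower and upper bound matrices, the slice label of every row is assumed to be supplied by the caller (for example from a response or from a clustering), and the covariance matrix of the transformed data is assumed to be nonsingular.

```python
import itertools
import numpy as np


def centers(lower, upper):
    """Centers method: replace each interval [a, b] by its midpoint."""
    return (lower + upper) / 2.0


def vertices(lower, upper):
    """Vertices method: expand each observation into the 2**p vertices
    of its hyperrectangle. Returns the vertex matrix and, for each
    vertex row, the index of the observation it came from."""
    n, p = lower.shape
    # Every lower/upper choice per variable, shape (2**p, p).
    masks = np.array(list(itertools.product([0, 1], repeat=p)), dtype=bool)
    # Broadcast to (n, 2**p, p): pick upper where mask is True, else lower.
    verts = np.where(masks[None, :, :], upper[:, None, :], lower[:, None, :])
    owner = np.repeat(np.arange(n), 2 ** p)
    return verts.reshape(n * 2 ** p, p), owner


def sir_directions(X, slice_labels, n_dirs=2):
    """Classical SIR for a data matrix X with one slice label per row.
    Assumes the sample covariance of X is nonsingular."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    # Inverse square root of the covariance, used to standardize X.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Weighted covariance of the slice means of the standardized data.
    M = np.zeros((p, p))
    for s in np.unique(slice_labels):
        idx = slice_labels == s
        m = Z[idx].mean(axis=0)
        M += idx.mean() * np.outer(m, m)
    w, v = np.linalg.eigh(M)
    order = np.argsort(w)[::-1][:n_dirs]
    # Map eigenvectors back to the original scale of X.
    return inv_sqrt @ v[:, order], w[order]
```

With the centers method, `sir_directions(centers(lower, upper), labels)` uses one row per observation. With the vertices method, each observation contributes `2**p` rows, so the slice labels have to be repeated accordingly, for example `labels[owner]`. Because the slice labels are an input, swapping one slicing scheme for another changes only that argument, which is where the sensitivity reported in the abstract would enter.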

