An Introduction to Multilinear Principal Component Analysis

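Abstract


Principal component analysis (PCA) is a simple and widely used method in statistical data analysis. It can be carried out through the eigenvalue decomposition of the covariance matrix. Traditional PCA handles vector variables, with each observation represented as a vector. When the observations are tensor objects, such as pictures, videos, EEG signals, or gene interactions, traditional PCA first vectorizes these tensor objects and then performs an eigenvalue decomposition of a large covariance matrix. Such PCA on vectorized tensor objects can be difficult and inefficient, mainly because the estimation process of PCA is unstable when the sample size is smaller than the dimension of the vectorized data. Multilinear principal component analysis (MPCA) is a refinement of PCA that preserves the tensor structure of the observations when searching for principal components. The main advantage of preserving this structure is that fewer parameters are needed to determine the principal component subspaces, which mitigates the adverse effects of high dimensionality and thereby improves the efficiency of estimation and prediction. In this article, we provide an accessible introduction to the basic concepts and techniques of MPCA. From a statistical point of view, and based on applications to some real data, readers can see the reasons for MPCA's success.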

Parallel Abstract


Principal component analysis (PCA) is a simple and very popular method in statistical data analysis. It can be carried out by eigenvalue decomposition of the covariance matrix. Traditional PCA deals with vector variables, and each observation is represented in vector form. When observations are tensor objects, such as images, videos, EEG signals over a spatial domain, or gene-gene interactions (as symmetric random matrices), traditional PCA first vectorizes these tensor objects and then proceeds with the eigenvalue decomposition of a large covariance matrix. This vectorized PCA for tensor data can be difficult and inefficient. The main reason is that the estimation process of PCA is unstable when the sample size is small compared to the dimension of the vectorized data. Multilinear principal component analysis (MPCA) is a modification of PCA. It preserves the natural tensor structure of the observations when searching for principal components. The main advantage of preserving the tensor structure is the parsimonious use of parameters in specifying the principal component subspaces, which mitigates the adverse influence of high dimensionality and hence leads to efficiency gains in estimation and prediction. In this article, we provide a user-friendly introduction to the basic concepts and techniques of MPCA. The reader will see the rationale for the success of MPCA, both from the statistical point of view and through some real data applications.
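
To make the contrast between vectorized PCA and MPCA concrete, the following is a minimal NumPy sketch for order-two tensors (matrices). It is an illustration under stated assumptions, not necessarily the procedure used in the article: the function names `vectorized_pca` and `mpca_order_two`, the alternating eigen-decomposition updates, the fixed iteration count, and the simulated data are all choices made here for illustration. The parameter counts at the end are loose counts of basis entries that ignore orthonormality constraints.

```python
import numpy as np


def vectorized_pca(X, k):
    """Top-k principal directions after flattening each p x q observation.

    X has shape (n, p, q); the covariance matrix is (p*q) x (p*q).
    """
    n = X.shape[0]
    V = X.reshape(n, -1)
    V = V - V.mean(axis=0)
    cov = V.T @ V / n
    _, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]          # keep the k leading eigenvectors


def mpca_order_two(X, p0, q0, n_iter=10):
    """Alternating sketch of MPCA for matrices (assumed scheme, fixed iterations).

    Returns A (p x p0) and B (q x q0) with orthonormal columns, so that each
    centered observation is summarized by the p0 x q0 core A.T @ X_i @ B.
    """
    Xc = X - X.mean(axis=0)
    q = Xc.shape[2]
    B = np.eye(q)                           # start with no column reduction
    for _ in range(n_iter):
        # Fix B, update A: leading eigenvectors of sum_i (X_i B)(X_i B)^T.
        XB = Xc @ B
        S_A = np.einsum('nij,nkj->ik', XB, XB)
        A = np.linalg.eigh(S_A)[1][:, ::-1][:, :p0]
        # Fix A, update B: leading eigenvectors of sum_i (A^T X_i)^T (A^T X_i).
        AX = np.einsum('ia,nij->naj', A, Xc)
        S_B = np.einsum('naj,nak->jk', AX, AX)
        B = np.linalg.eigh(S_B)[1][:, ::-1][:, :q0]
    return A, B


# Simulated data (hypothetical): n = 20 observations of size 8 x 6.
rng = np.random.default_rng(0)
n, p, q = 20, 8, 6
p0, q0 = 3, 2
X = rng.normal(size=(n, p, q))

U = vectorized_pca(X, k=p0 * q0)            # basis for a (p0*q0)-dim subspace
A, B = mpca_order_two(X, p0, q0)

# Loose parameter counts for describing a subspace of the same dimension.
params_vectorized = (p * q) * (p0 * q0)     # entries of U
params_mpca = p * p0 + q * q0               # entries of A and B
```

In this sketch both methods describe a subspace of dimension `p0 * q0`, but the MPCA basis is the Kronecker-structured pair `(A, B)`, which needs far fewer entries than the unstructured basis `U`. This is the parsimony the abstract refers to, and the gap widens as `p` and `q` grow.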
