
Mining Time Series Gene Expression Data for Gene Regulatory Networks

Advisor: 李瑞庭

Abstract


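Time series gene expression data can reveal the causal relationships among gene regulatory events. However, existing analysis methods such as gene network modeling build gene networks from the time series expression data of a single sample only, and the accuracy of their results leaves room for improvement. Moreover, because of its high computation time, gene network modeling can include only a small number of genes. We therefore propose an efficient data mining method for analyzing replicate time series gene expression data. The proposed method finds important regulatory patterns in large-scale gene expression data. These patterns can be used to generate gene regulatory rules, which can in turn be combined into complex gene regulatory networks. The resulting networks express the causal relationships among dynamic gene regulatory events and the strength of the regulation. We first evaluate the performance of the proposed method on simulated data. We also apply the method to real human cell cycle gene expression data. The results show that the proposed method is efficient and scalable, and that it provides finer and more accurate gene regulatory information from large-scale gene expression data.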

Parallel Abstract


Time series gene expression data can be exploited to reveal causal genetic events. However, current gene network modeling methods focus on a single sample of the dataset and may therefore suffer from a low recovery rate. Moreover, gene network modeling is limited to small sets of genes because of its high computation time. Our proposed approach efficiently mines gene regulatory patterns from large-scale replicate time series datasets. The patterns can be used to generate gene regulatory networks, which reveal the relationships among dynamic causal regulatory events and their regulatory intensities. We first evaluate the performance of the proposed approach on simulated data. We then apply it to human cell cycle data. The results show that the proposed method is not only efficient and scalable but also reveals complex regulatory information among a large number of genes.
