
發展一嶄新之比較基因體雜合微陣列正規化演算法

Development of a Normalization Algorithm for Array Comparative Genomic Hybridization

Advisor: 莊曜宇
Co-advisor: 陳一東 (Yidong Chen)

Abstract


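Genomic variation is one of the main causes of tumor formation and progression. Many studies have shown that abnormalities in DNA sequence copy number are strongly associated with cancer pathogenesis. Array comparative genomic hybridization (array CGH) was developed from gene-expression microarray technology and can identify sequence copy-number variations on chromosomes at high resolution. However, because of the inherent characteristics of array CGH, many analysis tools designed for gene-expression data, such as data normalization algorithms, usually fail to give satisfactory results. Here we describe a new array CGH normalization algorithm that exploits the dependency between probes at neighboring chromosomal positions in array CGH experiments to normalize array CGH data accurately. To validate the performance of this normalization algorithm, we also used a hidden Markov model (HMM) to develop a simulation system that generates array CGH data sets with random DNA sequence copy-number changes. In addition, we applied our algorithm to normalize array CGH data from three cell lines, CL1-0, CL1-1 and CL1-5, and compared the results. CL1-0, CL1-1 and CL1-5 are very closely related lung cancer cell lines classified by their differing invasiveness. Normalization not only significantly improved data quality but also strengthened the reliability of the experimental results. With the newly developed algorithm, the normalized data show distinct DNA sequence copy-number changes. Finally, building on this algorithm, we will also establish a user-friendly online system to provide convenient analysis of array CGH data.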

Parallel Abstract


Genomic instability is one of the fundamental factors in tumorigenesis and tumor progression. Many studies have shown that copy-number abnormalities at the DNA level are important in the pathogenesis of cancer. Array comparative genomic hybridization (array CGH), developed from expression microarray technology, can reveal segmental copy-number aberrations along the chromosomes at high resolution. However, owing to the nature of array CGH, many standard tools for processing expression data, such as data normalization methods, often fail to yield satisfactory results. We present a novel array CGH normalization algorithm that normalizes array CGH data accurately by exploiting the dependency between measurements from neighboring probes. To validate the performance of the normalization, we developed a hidden Markov model (HMM) that simulates a series of array CGH experiments with random DNA copy-number alterations. In addition, we applied the algorithm to real data from an array CGH study of the CL1-0, CL1-1 and CL1-5 cell lines, which are closely related lung cancer cell lines classified according to their differential invasiveness. The normalization significantly improved data quality and enhanced the reliability of the experimental results. With the newly developed algorithm, the normalized data showed distinct patterns of DNA copy-number alterations among these lung cancer cell lines. Finally, building on this development, we are establishing a user-friendly web-based system to provide convenient online analysis of array CGH data.
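The abstract does not specify the HMM used for simulation, so the following is only a minimal sketch of the general idea, not the thesis's simulator. It assumes three hypothetical hidden states (one, two, or three copies), a single "stay" probability for the transitions, Gaussian measurement noise, and a constant additive bias standing in for the kind of offset a normalization step would need to remove. The function name `simulate_acgh` and all of its parameters are illustrative.

```python
import math
import random

# Hypothetical hidden states mapped to copy number (assumption, not from the thesis).
STATES = {"loss": 1, "normal": 2, "gain": 3}


def simulate_acgh(n_probes, stay_prob=0.98, noise_sd=0.15, bias=0.0, seed=None):
    """Simulate log2 ratios for probes ordered along one chromosome.

    The chain is in the "normal" state before the first probe. At each probe it
    keeps its state with probability stay_prob, otherwise it jumps to one of the
    other states chosen uniformly. Each probe emits the ideal log2 ratio of its
    state plus a constant bias plus Gaussian noise.
    """
    rng = random.Random(seed)
    names = list(STATES)
    state = "normal"
    states, log_ratios = [], []
    for _ in range(n_probes):
        # Transition step: high stay_prob makes neighboring probes share a state.
        if rng.random() > stay_prob:
            state = rng.choice([s for s in names if s != state])
        # Emission step: ideal ratio against a two-copy reference.
        ideal = math.log2(STATES[state] / 2)
        log_ratios.append(ideal + bias + rng.gauss(0.0, noise_sd))
        states.append(state)
    return states, log_ratios


if __name__ == "__main__":
    states, ratios = simulate_acgh(1000, bias=0.2, seed=42)
    print(states[:5], [round(r, 3) for r in ratios[:5]])
```

Because the true state of every probe is known in simulated data, a normalization method can be scored by how closely the normalized log ratios return to the ideal values for each state.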

