  • Thesis

成份分離、辨識及統計分析於氣相層析質譜儀之圖形化工具套件

IDMass 2.0: Component Extraction, Identification and Statistics GUI Processing Toolkit for GC-MS

Advisor: 曾宇鳳

Abstract


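Gas chromatography-mass spectrometry (GC-MS) has become an important technique in metabolomics research. Analyzing GC-MS data involves steps such as pre-processing, peak detection, and deconvolution. In multi-sample analyses, alignment of sample signals and result visualization are also important. Many GC-MS software packages have been developed, but there is still no package that integrates all of these functions and presents the results with full visualization. Building on IDMass, our previously proposed package that integrates the steps necessary for analyzing GC-MS data, we present modifications and updates that make up the new IDMass 2.0. It differs from IDMass 1.0 in six main ways. First, we added a gap filling in chemical rank step. By examining the chemical rank detection results of all samples, this function fills in chemical ranks that went undetected in a small number of samples. Second, we added a component alignment step after component extraction. Component alignment first clusters the components and confirms that all members of each cluster share the same identification result. At the same time, we modified the component extraction algorithm to reduce computation time without lowering accuracy. In the previous algorithm, each iteration optimized the chromatogram and the mass spectrum together; because chromatograms and mass spectra can be combined in a very large number of ways, the program could not optimize both quickly at once. We split the optimization of chromatograms and mass spectra into separate steps to reduce the number of possible combinations and thereby the computation time required. These first two changes improved accuracy, measured by true positives, false negatives, and false positives, and increased computation speed. Third, we added a graphical user interface (GUI) and compound-oriented viewing, which let users examine how a specific compound behaves across all samples or experimental groups. Fourth, we designed a new chemical rank detection method tailored to the characteristics of gas chromatography/quadrupole mass spectrometry (GC/Q-MS) data. Fifth, we integrated the above functions with the automated processing system of IDMass 1.0. After setting all parameters in the GUI window or in a parameter configuration file, users can process all sample data automatically. Finally, we integrated the Python GUI, the R programs, and the related packages into a single cross-platform software package that can be used directly after running the installer, with no additional setup. All code is written in R and Python, with the Rpy2 package providing a common platform, so users can extend IDMass 2.0 with any Python or R package.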
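The gap-filling idea can be illustrated with a small sketch. The abstract does not specify the rule IDMass 2.0 uses, so everything below is an assumption made for illustration: the function name `fill_rank_gaps`, the majority-vote consensus, and the `max_missing_fraction` threshold are hypothetical and are not taken from the toolkit.

```python
from collections import Counter


def fill_rank_gaps(ranks, max_missing_fraction=0.2):
    """Fill undetected chemical ranks (None) for one retention-time window.

    ranks: dict mapping sample name -> detected chemical rank (int) or None.
    Assumed rule: gaps are filled with the most common detected rank, but
    only when the fraction of samples without a detection is small.
    """
    detected = [r for r in ranks.values() if r is not None]
    missing = len(ranks) - len(detected)
    if not detected or missing == 0:
        return dict(ranks)  # nothing to fill, or nothing to fill from
    if missing / len(ranks) > max_missing_fraction:
        return dict(ranks)  # too many gaps: leave them unfilled
    consensus = Counter(detected).most_common(1)[0][0]
    return {s: (consensus if r is None else r) for s, r in ranks.items()}


# Toy input: one of five samples missed the two co-eluting compounds.
window = {"s1": 2, "s2": 2, "s3": None, "s4": 2, "s5": 2}
filled = fill_rank_gaps(window)
```

With this threshold a gap is filled only when at most 20% of the samples lack a detection; a window in which many samples disagree is left untouched rather than forced to a consensus.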

Parallel Abstract


Gas chromatography coupled with mass spectrometry (GC/MS) has become an important technique for metabolomics. GC/MS data analysis involves several key steps, including pre-processing, peak detection, and deconvolution. A number of software packages are currently available, but few provide both a pipeline that integrates all the functions necessary for GC/MS data analysis and result visualization. We previously developed IDMass, an automated pipeline that integrates all the functions necessary for GC-MS data analysis; we have now modified and improved IDMass 1.0 to create IDMass 2.0. Compared with IDMass 1.0, it has six additional features. First, we add a gap filling in chemical rank procedure. By comparing the chemical rank detection results across samples, this procedure fills in chemical ranks that went undetected in only a few samples. Second, we add a component alignment procedure after component extraction. This procedure clusters the extracted components, and all members of a cluster are identified as the same compound. At the same time, we modified the component extraction algorithm of IDMass 1.0 to reduce computation time while keeping accuracy similar to that of the previous version. In the previous algorithm, both chromatogram and spectrum optimization were applied in each iteration, but there were too many possible combinations of chromatograms and spectra for the program to optimize them quickly. We separate the optimization into two distinct procedures to reduce the number of possible combinations of chromatograms and spectra and thereby the computation time. These first two features improve both computation speed and identification performance in terms of true positives, false negatives, and false positives.
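To see why separating the two optimizations shrinks the problem, consider a single component whose data matrix is modeled as the outer product of a chromatographic profile and a mass spectrum. Once one factor is held fixed, the other has a closed-form least-squares solution computed independently per scan or per m/z channel. The sketch below illustrates only that point; it is not the IDMass 2.0 extraction algorithm, which the abstract does not describe. The function names, the single-component model, and the flat starting spectrum are assumptions.

```python
def fit_chromatogram(X, s):
    """Least-squares chromatogram for a fixed spectrum s, clipped at zero.

    X is a list of rows (scans), each a list of intensities per m/z channel.
    """
    ss = sum(v * v for v in s)
    if ss == 0:
        raise ValueError("spectrum must not be all zeros")
    return [max(0.0, sum(x * v for x, v in zip(row, s)) / ss) for row in X]


def fit_spectrum(X, c):
    """Least-squares spectrum for a fixed chromatogram c, clipped at zero."""
    cc = sum(v * v for v in c)
    if cc == 0:
        raise ValueError("chromatogram must not be all zeros")
    # zip(*X) iterates over m/z channels (columns of X).
    return [max(0.0, sum(x * v for x, v in zip(col, c)) / cc) for col in zip(*X)]


# Toy single-component data: 3 scans by 3 m/z channels.
c_true = [1.0, 2.0, 1.0]
s_true = [3.0, 0.0, 1.0]
X = [[ci * sj for sj in s_true] for ci in c_true]

# Stage 1: estimate the chromatogram from a flat starting spectrum.
c_hat = fit_chromatogram(X, [1.0] * len(s_true))
# Stage 2: with the chromatogram fixed, solve for the spectrum.
s_hat = fit_spectrum(X, c_hat)
X_hat = [[ci * sj for sj in s_hat] for ci in c_hat]
```

On this noise-free toy matrix the two stages reproduce the data exactly, up to the usual scale ambiguity between the two factors. Real GC-MS regions contain several overlapping components and noise, which is where the choice of optimization strategy matters.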
Third, we provide a graphical user interface (GUI) and compound-oriented viewing, which allow users to observe or compare the expression levels of specified compounds across experimental groups or across all samples. Fourth, we design and implement a novel chemical rank detection method for GC/Q-MS data with a low scan rate. Fifth, we integrate these additional procedures and improvements into the automated pipeline of IDMass 1.0. Users can process all GC/MS data after setting the related parameters in the GUI parameter settings window or in parameter configuration files. Finally, we integrated the GUI shell written in Python, the R procedures, and the related packages into a single software program that can be used directly after installation, without any additional settings. All code is written in R and Python, and the Rpy2 package provides a common platform for the two environments. Users can add any Python or R package to extend IDMass 2.0.
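Compound-oriented viewing amounts to regrouping per-sample results by compound rather than by sample. The following minimal sketch shows that regrouping; the record layout `(sample, group, compound, area)`, the function name `compound_view`, and the use of the mean peak area as the expression level are assumptions for illustration and do not reflect the IDMass 2.0 data model.

```python
from collections import defaultdict
from statistics import mean


def compound_view(records, compound):
    """Return {experimental group: mean peak area} for one compound.

    records: iterable of (sample, group, compound, area) tuples.
    """
    by_group = defaultdict(list)
    for _sample, group, name, area in records:
        if name == compound:
            by_group[group].append(area)
    return {group: mean(areas) for group, areas in by_group.items()}


# Hypothetical identification results for three samples in two groups.
records = [
    ("s1", "control", "alanine", 10.0),
    ("s2", "control", "alanine", 14.0),
    ("s3", "treated", "alanine", 30.0),
    ("s1", "control", "glucose", 5.0),
]
alanine_levels = compound_view(records, "alanine")
```

The resulting dictionary is the kind of per-compound summary a GUI could plot as one bar per experimental group.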

