
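A Toolkit for Noise Reduction, Component Separation, and Identification for Gas Chromatography Coupled with Time-of-Flight Mass Spectrometry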

IDMass: Noise Reduction, Component Extraction, and Identification Processing Toolkit for GC/TOF-MS

Advisor: 曾宇鳳

Abstract


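Gas chromatography coupled with time-of-flight mass spectrometry (GC/TOF-MS) has become an important technique in metabolomics research. We present IDMass, a new algorithm that accurately and sensitively extracts and identifies individual components from mixed samples. IDMass comprises five main steps: noise reduction, deconvolution window detection, chemical rank determination, component extraction, and identification. First, the noise reduction step subtracts mass dimension noise, and the denoised peaks have better shapes, which raises their chance of being identified correctly. Second, in deconvolution window detection, IDMass needs no user-specified parameters for evaluating the threshold: it computes a threshold on the baseline-corrected total ion chromatogram and refines the region boundaries using nearby local minima. Third, in chemical rank determination, IDMass uses a two-layer local maximum method together with peak picking by continuous wavelet transform to separate peaks belonging to different components; it sensitively detects different components with similar spectra. Fourth, in component extraction, IDMass uses the optimal exponentially modified Gaussian model found by particle swarm optimization to extract the spectrum of each molecule within a compound region. The optimization runs automatically, with no manually set initial parameters. Peak shape information is the main constraint in IDMass component resolution and allows it to extract purer components than multivariate curve resolution. However, saturation of the mass spectrometer sometimes produces poorly shaped peaks that limit the performance of IDMass; this can be resolved by diluting the sample. Finally, by identifying molecules sequentially, IDMass integrates the complex model and spectral results into a table of compound signal intensities for further statistical analysis. IDMass was tested on a sample containing a mixture of 76 standards; the recall, precision, and F-score were 0.92, 0.81, and 0.86, respectively. IDMass successfully quantified the compounds identified in the sample mixed from 76 standards.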

Parallel Abstract


Gas chromatography/time-of-flight mass spectrometry (GC/TOF-MS) has become an important technique for metabolomics. In this study we developed IDMass, a novel algorithm that accurately and sensitively extracts and identifies the individual components in GC/TOF-MS samples. IDMass comprises five main steps: noise reduction, deconvolution window determination, chemical rank determination, component extraction, and identification. First, the noise reduction step subtracts detector noise in the mass dimension; the resulting peaks have better shapes, which also improves the identification result. Second, IDMass detects peak regions by calculating a threshold on the baseline-corrected total ion chromatogram (TIC) and refining the region boundaries using nearby local minima, without manually specified parameters for evaluating the threshold. Third, IDMass determines the chemical rank by a two-layer local maximum method with peak picking using the continuous wavelet transform, which better separates peaks from different components. This chemical rank determination method sensitively detects different components with similar spectra. Fourth, IDMass uses the optimal exponentially modified Gaussian (EMG) model, found by particle swarm optimization (PSO), to extract individual components without manually specified initial values for evaluating the elution peak shape. IDMass uses the peak shape information as a major constraint and is able to extract purer components than multivariate curve resolution (MCR) approaches, especially when co-eluting compounds have similar spectra. However, poorly shaped elution peaks caused by saturation of the mass spectrometer detector limit the performance of IDMass; this can be resolved by sample dilution. Last, by identifying compounds sequentially, IDMass automatically integrates the results into a peak table for further statistical analysis.
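As a rough illustration of the component extraction step, the sketch below fits a single EMG peak with a generic textbook PSO. It is not the IDMass implementation: the function names (emg, pso, sse), the area-normalized EMG parameterization, the search bounds, the swarm settings, and the synthetic data are all assumptions made for this example, and the abstract does not state which objective or PSO variant IDMass uses.

```python
import math
import random


def emg(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian (assumed area-normalized form):
    a Gaussian (mu, sigma) convolved with an exponential decay of time constant tau."""
    z = (sigma / tau - (t - mu) / sigma) / math.sqrt(2.0)
    expo = sigma ** 2 / (2.0 * tau ** 2) - (t - mu) / tau
    return area / (2.0 * tau) * math.exp(expo) * math.erfc(z)


def pso(objective, bounds, n_particles=30, n_iter=100, seed=0,
        w=0.7, c1=1.5, c2=1.5):
    """Generic global-best PSO minimizer; positions are clamped to the bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the bounds, so no initial guess is needed.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


# Synthetic, noise-free elution profile (hypothetical values, not thesis data).
times = [0.25 * k for k in range(120)]
true_params = (3.0, 10.0, 1.0, 2.0)  # area, mu, sigma, tau
observed = [emg(t, *true_params) for t in times]


def sse(params):
    """Sum of squared errors between a candidate EMG and the observed profile."""
    return sum((emg(t, *params) - y) ** 2 for t, y in zip(times, observed))


# Assumed search ranges for (area, mu, sigma, tau).
bounds = [(0.1, 10.0), (0.0, 20.0), (0.3, 3.0), (0.3, 5.0)]
best, best_val, history = pso(sse, bounds)
print("fitted (area, mu, sigma, tau):", best)
print("final sum of squared errors:", best_val)
```

Because the swarm starts from random positions inside the bounds, no hand-picked starting values are required, which is the property the abstract attributes to the PSO-based fit. In a real chromatogram the objective would presumably also involve the mass spectra of co-eluting components, which this single-peak sketch leaves out.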
The performance of IDMass was tested on a data set from a mixture of 76 standards; the recall, precision, and F-score were 0.92, 0.81, and 0.86, respectively. IDMass was also used successfully to quantify the compounds identified in this 76-standard mixture.
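The reported scores can be checked against each other with a short calculation. The sketch below assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name f_score is only illustrative.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract.
f1 = f_score(precision=0.81, recall=0.92)
print(round(f1, 2))  # 0.86
```

Under that assumption the three reported numbers are mutually consistent: 2 x 0.81 x 0.92 / (0.81 + 0.92) is about 0.8615, which rounds to 0.86.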

