
Developing Data-Independent Acquisition Mass Spectrometry for Cross-Cancer Tissue Analysis

Advisor: Yu-Ju Chen (陳玉如)

Abstract


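In recent years, rapid advances in mass spectrometry instrumentation and analysis software, together with the ability of data-independent acquisition (DIA) to sample all ions comprehensively, have greatly improved the depth and reproducibility of protein quantification. Large-scale, high-quality spectral libraries play an important role in the effective analysis of DIA data. When the DIA data and the spectral library are generated on the same instrument, protein quantification and reproducibility are high. However, DIA data and spectral libraries generated on instruments in different laboratories have not yet been compared. In addition, under a given set of mass spectrometry (MS) and liquid chromatography (LC) conditions, accumulating a large body of spectral data usually requires large amounts of sample and time-consuming instrument analysis. To address the need to build a large-scale spectral library before library-based DIA protein analysis can be performed, this study aimed to: (1) evaluate the effect of data generated in different laboratories on DIA protein analysis; (2) evaluate the effect of spectral libraries of different sample composition on DIA protein analysis; and (3) apply single-shot DIA to the proteome analysis of breast cancer tissue for the study of cancer biology.

In the first part of the thesis, we compared the spectral similarity and retention-time differences of spectra generated on different instruments. We further assessed whether DIA data and a spectral library generated on the same instrument match each other better. In our evaluation, spectral similarity was high, retention-time accuracy was high, and the numbers of proteins identified by DIA with the libraries built on the two instruments were similar. We therefore conclude that the data generated on the two instruments are interchangeable.

To evaluate the effect of library composition on DIA protein identification, in the second part of the thesis we used three breast cancer cell lines and twenty breast cancer tumor tissues, combined with high-pH reversed-phase fractionation to reduce sample complexity and increase overall proteome coverage. Using an Orbitrap mass spectrometer in data-dependent acquisition (DDA) mode, we built a breast cancer spectral library containing 10,652 proteins. We then merged the lung cancer spectral library with the breast cancer spectral library to build a cross-cancer spectral library containing 12,687 proteins.

In cross-cancer tissue analysis, a key question is whether a spectral library built from a single sample type can be applied to the DIA analysis of different samples. In this study, we used the lung cancer, breast cancer, and cross-cancer spectral libraries to analyze lung cancer and breast cancer DIA data. Single-shot DIA identified the highest numbers of proteins and cancer target proteins with the project-specific library (7,800 proteins in the lung cancer sample and 5,800 proteins in the breast cancer sample), which shows the importance of a spectral library that has the same sample composition as the DIA data. On the other hand, compared with the project-specific library, the cross-cancer spectral library has higher spectral quality, so the proteins quantified by DIA with it have higher peptide sequence coverage and higher protein confidence.

In the third part of the thesis, we applied single-shot DIA with the cross-cancer spectral library to tissues from 8 breast cancer patients representing four subtypes and performed a cluster analysis of protein expression. Across the 8 DIA data sets, a total of 7,032 proteins were identified, with an average of 6,047 proteins identified per tissue (4,531 proteins quantified with a CV below 20%). Multiple t-tests across the 8 patients found 492 significantly differentially expressed proteins (FDR<0.01). By unsupervised hierarchical clustering, we showed that protein expression levels group the patients in agreement with the conventional breast cancer subtypes. Pathway enrichment analysis with the DAVID database then showed that all enriched pathways (FDR<0.05) fall into three cancer-related categories: (1) cell proliferation (spliceosome pathway, RNA transport pathway, etc.); (2) induction of angiogenesis (complement and coagulation cascades pathway, proteasome pathway, etc.); and (3) metabolism (metabolic pathways, carbon metabolism pathway, etc.).

In summary, this study provides an evaluation of the spectral-library-based DIA strategy, covering both instrument effects and sample-composition effects. Our results indicate the feasibility of using the same type of instrument and the importance of using a cross-cancer spectral library composed of the samples of interest in order to identify more proteins over a broader range.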
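The reproducibility criterion mentioned above (proteins quantified with a CV below 20%) can be illustrated with a minimal sketch. It assumes the CV is the sample standard deviation divided by the mean across replicate measurements; the abstract does not state how the CV was computed, and the function names and intensity values below are hypothetical, not thesis data.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV (standard deviation / mean) as a fraction."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / m


def filter_by_cv(intensities, max_cv=0.20):
    """Keep proteins whose CV across replicate measurements is below max_cv."""
    return {
        protein: values
        for protein, values in intensities.items()
        # At least two values are needed for a sample standard deviation.
        if len(values) >= 2 and coefficient_of_variation(values) < max_cv
    }


# Hypothetical intensities for three replicate measurements (illustration only).
example = {
    "PROT_A": [100.0, 105.0, 98.0],  # tight replicates, low CV
    "PROT_B": [50.0, 90.0, 20.0],    # scattered replicates, high CV
}
kept = filter_by_cv(example)
```

In this toy input only `PROT_A` passes the 20% threshold; `PROT_B` is dropped because its replicate values scatter widely around the mean.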

Parallel Abstract


Data-independent acquisition mass spectrometry (DIA-MS) fragments all precursors in parallel. This feature, together with recent advances in instrumentation and informatics tools, has improved both depth of coverage and reproducibility in quantitative proteomics. A comprehensive, high-quality spectral library plays an important role in efficient signal extraction from DIA data. One critical question is whether a library built from one cancer type can be used to extract DIA data from another cancer type. However, a few technical issues remain to be studied before the general utility of a spectral library can be evaluated. First, spectral-library-based analysis has shown high reproducibility on a single instrument, but it has not been evaluated across instruments in different laboratories. In addition, accumulating a large spectral library often requires large amounts of the samples of interest and time-consuming peptide fractionation under identical LC-MS/MS conditions. To address these issues in deep proteome analysis by DIA-MS, this study aimed to (1) evaluate the performance of DIA-MS across different laboratories and (2) evaluate the effect of library composition on DIA-MS by constructing a breast cancer library and a combined breast and lung (termed cross-cancer) library. Finally, single-shot DIA was applied to the proteome analysis of breast cancer tissues to study cancer biology.

In the first part of the thesis, we compared the spectral similarity and chromatographic information of spectral libraries constructed on different instruments. The results showed that data transfer between the two instruments is feasible because of the high spectral similarity (98% of peptides showed over 50% similarity), the high accuracy of the calibrated retention times, and the small difference in the number of proteins identified by DIA-MS.

To evaluate the effect of spectral library composition on protein identification in DIA-MS, in the second part of the thesis we constructed a breast cancer library from three breast cancer cell lines and pooled breast cancer tumor tissues from twenty patients. Using high-pH reversed-phase peptide fractionation followed by data acquisition on an Orbitrap Fusion Lumos MS, we built a breast cancer library with 10,652 protein groups. Next, by merging it with the 12,375 proteins in our previously established lung cancer library, we constructed the cross-cancer library (n=12,687 proteins). We then searched single-shot DIA data from breast cancer tissue and lung cancer tissue against the lung cancer, breast cancer, and cross-cancer libraries. The highest protein numbers (7,800 proteins in the lung cancer sample and 5,800 proteins in the breast cancer sample) were achieved with the project-specific library, which suggests the importance of sample-dependent protein composition in the spectral library. On the other hand, using the cross-cancer library for DIA-MS gave better spectrum quality than the project-specific library, leading to higher protein sequence coverage and higher confidence scores for the identified proteins.

In the third part of the thesis, we applied single-shot DIA to profile the proteomes of the four subtypes of breast cancer in tissues from 8 patients. In total, 7,032 proteins were identified, with an average of 6,047 proteins identified per tissue (4,531 proteins quantified with a CV below 20%). Multiple t-tests across the 8 patients found 492 significantly differentially expressed proteins (FDR<0.01). By unsupervised hierarchical clustering, we demonstrated that proteotype-based classification can assign individual tumor tissues to their conventional subtypes. By pathway enrichment analysis with the DAVID database, the enriched pathways (FDR<0.05) can be grouped into three categories related to cancer hallmarks: (1) cell proliferation (spliceosome, RNA transport pathway, etc.); (2) inducing angiogenesis (complement and coagulation cascades, proteasome pathway, etc.); and (3) metabolism (metabolic pathways, carbon metabolism, etc.).

Overall, this study evaluates the effects of instrument and sample composition on the spectral-library-based DIA strategy. For deep proteome coverage, our results indicate the feasibility of using the same type of instrument and the importance of using a comprehensive spectral library composed of the proteomes of the samples of interest.
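The library sizes reported above allow a rough estimate of how much the breast and lung cancer libraries overlap. The sketch below assumes the cross-cancer library is a plain union of protein identifiers counted at the same level in all three libraries. The abstract does not confirm this, and protein grouping may change when libraries are merged, so the result is an estimate only. The function name is hypothetical.

```python
def shared_count(size_a, size_b, size_union):
    """Inclusion-exclusion: |A intersect B| = |A| + |B| - |A union B|."""
    shared = size_a + size_b - size_union
    if shared < 0 or shared > min(size_a, size_b):
        raise ValueError("sizes are inconsistent with a simple union")
    return shared


# Library sizes as reported in the abstract.
breast = 10_652   # breast cancer library
lung = 12_375     # previously established lung cancer library
cross = 12_687    # merged cross-cancer library

shared = shared_count(breast, lung, cross)   # proteins present in both libraries
breast_only = cross - lung                   # proteins contributed only by the breast library
lung_only = cross - breast                   # proteins contributed only by the lung library
```

Under the union assumption, about 10,340 proteins would be common to both libraries, with roughly 312 unique to the breast cancer library and 2,035 unique to the lung cancer library.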

