  • Degree thesis

Design and Application of Software for Automated Glycopeptide Analysis and Identification

Computational Tools for Automated Glycopeptide Sequencing and Identification in LC-MSn Based Glycoproteomic Applications

Advisor: 邱繼輝

Abstract


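Glycosylation of a protein often affects its structure, molecular stability, and function. Efficiently resolving and mapping the full picture of the glycan structures at specific glycosylation sites of a target protein (site-specific glycosylation) is therefore an eagerly awaited goal for both basic research and the biochemical industry. This need, however, cannot be met by the widely used approach of analyzing deglycosylated peptides by mass spectrometry: the information it produces not only carries a hidden risk of false positives but also answers only how the glycosylated sites are distributed on the protein. Direct mass spectrometric analysis of intact glycopeptides remains the only way to characterize the heterogeneous glycosylation at specific sites. Yet efficiently extracting and identifying the glycopeptides hidden in the massive datasets produced by liquid chromatography tandem mass spectrometry is still a technical challenge, and the lack of practical software in particular is the main obstacle to reaching this goal. Recognizing this pressing practical need, this thesis attempts to develop the software Sweet-Heart, in the hope of solving the core problems faced when sequencing glycopeptides by tandem mass spectrometry. The software is designed specifically for low-resolution, low-accuracy MS2 data, so it can make effective use of the inherently high sensitivity and speed of ion trap mass spectrometers. Strategically, Sweet-Heart efficiently extracts glycopeptide spectra, combines knowledge-based de novo interpretation of glycan-containing fragment ions with protein database searching, and then applies a machine learning strategy to weight and rank all possible glycan and peptide combinations. The software automatically generates a list of high-ranking candidates to facilitate the acquisition of MS3 data, which confirm the peptide sequence of each glycopeptide. It can also use the information already obtained to retrieve related glycopeptides that may have been missed during computation and present them in the final integrated report. The result is a platform with sufficient sensitivity and selectivity to help uncover novel glycosylation and to resolve the mass ambiguity between the combination of N-glycolyl neuraminic acid and Fucose and the combination of N-acetyl neuraminic acid and Hexose. In evaluations of its computational performance, on both purified single glycoproteins and complex glycoproteomic data, Sweet-Heart showed high sensitivity in identifying the glycan structures at individual glycosylation sites. Four samples of differing abundance and complexity were used in this thesis: human secreted epidermal growth factor receptor, epidermal growth factor receptor purified from primary liver cancer cells, mouse serum proteins, and membrane protein samples of BCL1 cells before and after differentiation. Notably, with the help of Sweet-Heart, a novel glycosylation site was confidently identified on the epidermal growth factor receptor from primary liver cancer cells, and structural considerations suggest that glycosylation at this site may affect the receptor's activity. In addition, this thesis points out the shortcomings of current glycopeptide enrichment strategies, which have significantly affected the qualitative and quantitative identification results driven by this software.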

Parallel Abstract


Protein glycosylation often affects the conformation, stability, and function of its carrier protein. The need to efficiently map site-specific glycosylation patterns for both basic research and the bio-industry is thus well appreciated. In this respect, the prevalent approach of identifying de-N-glycosylated peptides by mass spectrometry (MS) is prone to false positives and addresses only the issue of site occupancy. MS analysis of intact glycopeptides remains the only way to define the diversity of site-specific glycosylation. However, efficient identification of intact glycopeptides from a shotgun glycoproteomic LC-MS2 dataset is technically challenging and is particularly handicapped by the lack of enabling computational tools. In response to these needs, this thesis develops Sweet-Heart, a computational tool set that attempts to tackle the core problems in MS2 sequencing of glycopeptides. It is specifically designed to accept low-resolution, low-accuracy MS2 data, so as to capitalize on the high sensitivity and scan speed of ion trap instruments. Sweet-Heart efficiently filters for glycopeptides, couples knowledge-based de novo interpretation of glycosylation-dependent fragmentation patterns with protein database search, and uses a machine-learning algorithm to score the computed glycan and peptide combinations. Higher-ranking candidates are then compiled into a list of MS2/MS3 entries to drive subsequent rounds of targeted MS3 sequencing of the putative peptide backbone, allowing its validation by database search in a fully automated fashion. With additional retrieval of all related glycoforms and final data integration, the platform proves to be sufficiently sensitive and selective, conducive to novel glycosylation discovery, and robust enough to resolve mass ambiguities such as, among others, that between N-glycolyl neuraminic acid/Fucose and N-acetyl neuraminic acid/Hexose.
A critical appraisal of its computing performance shows that Sweet-Heart allows high-sensitivity, comprehensive mapping of site-specific glycosylation for isolated glycoproteins and facilitates analysis of glycoproteomic data. Samples representing different amounts and complexity were tested, including human sEGFR, affinity-purified human EGFR from primary lung cancer cells, mouse serum as a secreted proteome, and the membrane proteome of BCL1 cells during differentiation. A notable discovery was the unambiguous identification of a novel N-glycosylation site on EGFR from some primary lung cells, which may influence the receptor's functional activity. Moreover, this work demonstrated the inadequacy of current glycopeptide enrichment approaches, which significantly affected Sweet-Heart-driven glycopeptide identification and quantification.
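
The mass ambiguity mentioned above can be made concrete with a short calculation. The sketch below is not taken from Sweet-Heart. It is a minimal illustration that assumes the standard residue (anhydro) elemental compositions of the four monosaccharides and standard monoisotopic atomic masses, and the names `RESIDUES`, `composition`, and `mass` are hypothetical. It shows that the NeuGc + Fuc pair and the NeuAc + Hex pair share one elemental composition, so mass accuracy alone cannot separate them, whatever the instrument resolution.

```python
from collections import Counter

# Monoisotopic atomic masses in Da (standard values, not from the thesis).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}

# Residue (anhydro) elemental compositions of the four monosaccharides.
# These are assumptions of this sketch, not values quoted by the thesis.
RESIDUES = {
    "Hex": Counter(C=6, H=10, O=5),
    "Fuc": Counter(C=6, H=10, O=4),
    "NeuAc": Counter(C=11, H=17, N=1, O=8),
    "NeuGc": Counter(C=11, H=17, N=1, O=9),
}


def composition(residue_names):
    """Sum the elemental compositions of a list of residue names."""
    total = Counter()
    for name in residue_names:
        total += RESIDUES[name]
    return total


def mass(comp):
    """Monoisotopic mass (Da) of an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in comp.items())


pair_a = composition(["NeuGc", "Fuc"])
pair_b = composition(["NeuAc", "Hex"])

print("Same elemental composition:", pair_a == pair_b)
print("NeuGc + Fuc:", round(mass(pair_a), 4), "Da")
print("NeuAc + Hex:", round(mass(pair_b), 4), "Da")
```

Because the two pairs are exact isomers at the composition level, telling them apart has to rely on fragmentation evidence rather than on the precursor mass.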
