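Protein phosphorylation participates in many biological processes in living organisms, such as DNA replication, gene transcription, and protein translation. It is also a key step in intracellular signal transduction, which is required for cell growth, metabolism, proliferation, and differentiation, as well as for communication between cells. Current experiments can use mass spectrometry to identify phosphorylation sites on a large scale, but determining which kinase catalyzes a given site still requires traditional biological experiments, which consume a great deal of time, money, and labor. This study therefore aims to identify kinase-specific catalytic sites by large-scale computation, which can save substantial resources and time. It focuses on protein kinases whose phosphorylation sites are characterized by basic amino acids (arginine, histidine, and lysine), and uses six kinds of features, namely HMM, amino acid identity, PSSM, ASA, disorder, and protein-protein interaction, in the hope of correctly distinguishing protein kinases that share similar protein sequence features.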
Protein phosphorylation is involved in many biological processes, such as DNA replication, gene transcription, and protein translation. It also plays a critical role in signal transduction, which underlies cell growth, metabolism, proliferation, and differentiation, as well as intercellular communication. Mass spectrometry (MS) has been widely used to identify large numbers of protein phosphorylation sites. However, the catalytic kinases for the MS-identified phosphorylation sites remain unknown, especially for sites with similar substrate motifs. Therefore, this study develops a computational method to identify kinase-specific phosphorylation sites whose substrate motifs contain basic amino acids (arginine, histidine, and lysine). In this work, a total of six features, namely amino acid sequence, amino acid identity, position-specific scoring matrix (PSSM), accessible surface area (ASA), disorder regions, and protein-protein interaction (PPI), are investigated for correctly distinguishing phosphorylation sites with similar sequence features.
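
As an illustration of the sequence-based features named above, the following minimal sketch extracts a window of residues around a candidate phosphorylation site, encodes it by amino acid identity, and reports where basic residues occur relative to the site. It is not the implementation used in this study: the function names, the flank size of 4, the padding character, and the example sequence are all assumptions made for illustration.

```python
# Hypothetical sketch: function names, flank size, and padding are assumptions,
# not details taken from the study.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BASIC_RESIDUES = set("RHK")  # arginine, histidine, lysine


def site_window(sequence, site_index, flank=4, pad="-"):
    """Return the residues from site_index - flank to site_index + flank,
    padding with `pad` where the window runs past either terminus."""
    padded = pad * flank + sequence + pad * flank
    center = site_index + flank
    return padded[center - flank:center + flank + 1]


def one_hot(window):
    """Encode each residue as a 20-dimensional indicator vector
    (all zeros for padding characters)."""
    return [[1 if residue == aa else 0 for aa in AMINO_ACIDS] for residue in window]


def basic_positions(window, flank=4):
    """Offsets, relative to the central site, at which a basic residue occurs."""
    return [i - flank for i, residue in enumerate(window) if residue in BASIC_RESIDUES]


# Illustrative sequence only; index 5 is the serine treated as the candidate site.
window = site_window("MKRRASVGEL", site_index=5)
encoded = one_hot(window)
offsets = basic_positions(window)
```

The resulting indicator vectors could be concatenated with other per-site descriptors (for example PSSM or ASA values) before being passed to a classifier; how the study combines its features is not specified here.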