
Development of a killer-cell immunoglobulin-like receptor (KIR) genotyping bioinformatics pipeline using next-generation sequencing data

Advisor: 楊雅倩
The full text will be available for download on 2027/02/10.

Abstract


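Natural killer (NK) cells play an important role in the innate immune system and are the first line of defense against viral infection and tumors. Killer-cell immunoglobulin-like receptors (KIRs) are expressed mainly on NK cells and regulate the activation and inhibition of NK cell immune responses by binding specific human leukocyte antigen (HLA) class I molecules. The KIR family is located on the long arm of chromosome 19 at 19q13.4 and consists of 15 functional genes and 2 pseudogenes. Because KIR genes show high polymorphism and copy number variation, genotyping with conventional methods (such as PCR-SSP and qPCR) is time-consuming and costly; establishing a KIR genotyping and bioinformatics analysis pipeline based on next-generation sequencing (NGS) is therefore important. In this thesis, 80 DNA samples from the Taiwan Biobank were first tested on a PCR-SSP KIR genotyping platform, and the results served as the reference genotypes. In parallel, whole-genome sequencing (WGS) data were analyzed bioinformatically. The analysis first used Pushing Immunogenetics to the Next Generation (PING), software developed by Norman et al. in 2016. Compared with the PCR-SSP results, 52 of 1185 gene calls (15 loci x 79 cases) were discordant (4.39%), and no KIR allele calls could be obtained from any of the WGS data. In addition, PING genotyping could not distinguish KIR2DL2 from KIR2DL3, and concordance for KIR2DL2/3 was only 65.8% (52/79). PING_extractor was therefore used to select the sequence reads from the KIR region, which were then genotyped with a self-built bioinformatics pipeline (gKIR). Among 1280 gene calls (16 loci x 80 cases), the number discordant with PCR-SSP fell to 2 (0.16%); this pipeline could also call KIR2DL2 and KIR2DL3 separately, each with 100% concordance. Separately, KIR genotyping was performed on the whole-genome assembly (WGA) data released by the Human Genome Structural Variation Consortium (HGSVC) for the family HG00731 (father), HG00732 (mother), and HG00733 (child). Using each of the three WGAs as the reference sequence, the coding region sequences of the 1532 KIR alleles published by IPD-KIR were aligned with Minimap2 to determine the alleles carried by each reference sample, and the calls were further confirmed by cross-comparison among the three family members. With this workflow, the alleles of all KIR genes present in the HG00733 reference sample could be confirmed, except for KIR2DL4, which lies at the recombination hotspot of the KIR region, and its neighboring gene KIR3DL1. Applying this workflow to obtain reference KIR haplotype results for other reference samples would provide validation standards for establishing an NGS-based KIR genotyping platform.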
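The comparison against the PCR-SSP reference comes down to counting locus-level calls that disagree. The following is a minimal sketch of that bookkeeping, not the thesis's own code: it assumes each method's output has been reduced to a presence/absence call per sample and locus, and the function name `discordance`, the sample IDs, and the toy calls are all hypothetical.

```python
from typing import Dict, Tuple

# Presence/absence calls, keyed as sample -> locus -> gene present?
# (Assumed data layout; the thesis does not specify one.)
Calls = Dict[str, Dict[str, bool]]


def discordance(reference: Calls, test: Calls) -> Tuple[int, int, float]:
    """Return (mismatches, total compared, discordant rate).

    Only samples and loci present in both call sets are compared, so a
    sample that failed in the tested pipeline drops out of the total.
    """
    total = 0
    mismatches = 0
    for sample, ref_loci in reference.items():
        test_loci = test.get(sample)
        if test_loci is None:
            continue  # sample has no valid result in the tested pipeline
        for locus, ref_present in ref_loci.items():
            if locus not in test_loci:
                continue  # locus not reported by the tested pipeline
            total += 1
            if test_loci[locus] != ref_present:
                mismatches += 1
    rate = mismatches / total if total else 0.0
    return mismatches, total, rate


# Toy data (hypothetical samples S1 and S2, three loci each).
ref_calls: Calls = {
    "S1": {"KIR2DL1": True, "KIR2DL2": False, "KIR2DL3": True},
    "S2": {"KIR2DL1": True, "KIR2DL2": True, "KIR2DL3": True},
}
test_calls: Calls = {
    "S1": {"KIR2DL1": True, "KIR2DL2": False, "KIR2DL3": True},
    "S2": {"KIR2DL1": True, "KIR2DL2": False, "KIR2DL3": True},
}

mismatches, total, rate = discordance(ref_calls, test_calls)
print(f"{mismatches}/{total} discordant ({rate:.2%})")
```

The same arithmetic gives the rates reported above: 52/1185 is about 4.39% and 2/1280 is about 0.16%.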

Parallel Abstract


Natural killer (NK) cells play an important role in the innate immune system and act as a first line of defense against viral infection and tumors. Killer-cell immunoglobulin-like receptors (KIRs) are principally expressed on the surface of NK cells, where they interact with HLA class I ligands and regulate NK cell responses through activating and inhibitory KIRs. The KIR gene family is composed of 15 functional genes and 2 pseudogenes at chromosome 19q13.4. The KIR region is characterized by high gene copy number variation (CNV) and allelic diversity, which makes genotyping with conventional genetic tests such as PCR-SSP and qPCR costly and time-consuming. KIR genotyping via a bioinformatics pipeline that works on available NGS data is therefore promising. In this study, DNA samples and whole-genome sequencing (WGS) data for 80 cases were obtained from the community-based cohort of the Taiwan Biobank. KIR genotyping was first performed with a CE-marked PCR-SSP kit, which can distinguish all 17 KIR genes. The PCR-SSP genotypes were then used as benchmarks for the results of bioinformatics pipelines processing the WGS data. The PING (Pushing Immunogenetics to the Next Generation) pipeline, published by Norman et al. in 2016, is designed to determine 15 KIR loci, copy numbers, and alleles. However, for the 79 valid cases, PING produced no allele calls, and its genotype results showed a discordant rate of 4.39% (52/1185) against the PCR-SSP results. Moreover, the gene calling step of the PING pipeline could not distinguish 2DL2 from 2DL3, and the concordant rate for 2DL2/3 reached only 65.8% (52/79), suggesting that the PING pipeline may have limitations with low-coverage WGS data. KIR reads extracted from the WGS data by PING_extractor were therefore analyzed further with a self-built pipeline (gKIR) that can determine 16 KIR loci. All 80 cases were valid in the gKIR pipeline. The gKIR genotypes performed better, with a lower discordant rate of 0.16% (2/1280) against the PCR-SSP results. gKIR could also determine the CNV of KIR genes. In addition, we analyzed the KIR allele distribution of standard samples using the published whole-genome assembly (WGA) data of a family trio (the Puerto Rican trio HG00731, HG00732, and HG00733) obtained from the Human Genome Structural Variation Consortium (HGSVC). The coding sequences of the 1532 KIR alleles published by IPD-KIR were aligned to each WGA using Minimap2, and the alleles carried by each sample were determined by finding perfectly matched alleles. With the parental data (HG00731 and HG00732), we further obtained two highly consistent KIR haplotypes at the allele level for HG00733, except for KIR2DL4 and KIR3DL1, which lie close to the recombination hotspot of the KIR region. The KIR genotype and allele information determined for these standard samples could be used to validate the gKIR pipeline.
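The step of finding perfectly matched alleles can be illustrated by filtering alignment records. The sketch below is not the thesis's implementation. It assumes the Minimap2 output is in PAF format and was produced with base-level alignment, so that column 10 counts exactly matching bases. It treats an allele as perfectly matched when its whole coding sequence is aligned and every base matches. The function name `perfect_hits` and the allele and contig names in the toy input are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    allele: str   # query name (allele coding sequence)
    contig: str   # target name (assembly contig)
    start: int    # target start, 0-based
    end: int      # target end, exclusive
    strand: str   # "+" or "-"


def perfect_hits(paf_lines: Iterable[str]) -> List[Hit]:
    """Keep PAF records where the full query aligns with no mismatch.

    Criterion (an assumption of this sketch): query start is 0, query end
    equals query length, and the matching-base count (column 10) equals
    the query length.
    """
    hits: List[Hit] = []
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        qname, qlen = fields[0], int(fields[1])
        qstart, qend = int(fields[2]), int(fields[3])
        strand, tname = fields[4], fields[5]
        tstart, tend = int(fields[7]), int(fields[8])
        n_match = int(fields[9])
        if qstart == 0 and qend == qlen and n_match == qlen:
            hits.append(Hit(qname, tname, tstart, tend, strand))
    return hits


# Toy PAF records (hypothetical names and coordinates).
toy_paf = [
    "alleleA\t984\t0\t984\t+\tcontig1\t500000\t1000\t15000\t984\t984\t60",
    "alleleB\t984\t0\t984\t+\tcontig1\t500000\t1000\t15000\t983\t984\t60",
    "alleleC\t1000\t5\t1000\t-\tcontig2\t400000\t200\t9000\t995\t995\t60",
]

for hit in perfect_hits(toy_paf):
    print(hit.allele, hit.contig, hit.start, hit.end, hit.strand)
```

In the toy input only the first record passes: the second has one non-matching base and the third does not cover the start of the query. Confirming calls across the trio, as described above, would be a separate step that compares the hits found in the child's assembly with those found in each parent's.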

