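Parkinson's disease (PD) is one of the most common neurodegenerative diseases. It is caused by the death of dopamine-secreting neurons in the substantia nigra pars compacta of the midbrain. Its typical motor symptoms include bradykinesia, resting tremor, rigidity, and gait and postural instability, and its prevalence among people aged 65 years and over is 2–3%. PD is thought to arise from complex interactions among aging, environmental factors, and genetic susceptibility. About 10% of PD patients have a family history with a monogenic form of inheritance. Over the past two decades, researchers have identified causative genes or susceptibility loci in a small number of families with Mendelian inheritance, providing further clues to the pathogenesis of PD. There is currently no way to cure PD or slow its progression, and although research teams studying different populations worldwide have carried out large-scale genetic studies, the causative mutations or genetic risk factors remain unknown for the vast majority of PD patients. The burden of PD on society is growing and no disease-modifying therapy exists, so further study of the genetic etiology and molecular pathways of PD is needed. The discovery of new causative genes may provide important clues to the pathological mechanisms of PD and, in turn, a theoretical basis for disease-modifying therapy. Next-generation whole exome sequencing is the method of choice for finding novel causative genes in monogenic disorders. This study, carried out in three stages, aimed to use next-generation sequencing to identify a novel causative gene for familial PD.
In the first stage, we studied a three-generation Taiwanese family with an autosomal dominant inheritance pattern and no mutations in known PD-causing genes. We sequenced the exons of all genes by whole exome sequencing and then established a filtering strategy based on allele frequency, segregation analysis, and bioinformatic information to narrow down the candidate genes that might cause PD in this family. We selected variants located in exons and splice sites, removed synonymous variants, removed polymorphisms with an allele frequency greater than 5%, and used multiple prediction tools to assess whether a variant alters protein function and structure, or whether a splice-site variant affects translation. In three patients and one unaffected member of this family, a cumulative total of more than 1.5 million variants was detected. After removing variants also carried by the unaffected family member, 29 variants co-segregated with PD among these four members. Because we had not performed whole exome sequencing on every member of the family, we then tested the whole family by Sanger sequencing, which left six variants that co-segregated with PD in this family.
In the second stage, we used Sanger sequencing to test whether the candidate variants were also present in 182 unrelated PD families. C10orf120 c.T251C, KIF6 c.G1420A, and PNPLA7 c.C851T were also found in other families with hereditary PD. We used BLAST to align the primary structure of the sequences and check whether the target sequence is conserved across species, examined the protein functional association network of each target gene to see whether it is involved in known molecular mechanisms of PD or linked to known PD-causing genes, and finally used the Human Protein Atlas to confirm whether the gene is expressed in neural tissue. In parallel, we used the GeneMatcher data-sharing platform to find out whether other research teams had found the same variants and whether their patients had the same or a similar clinical presentation to ours.
In the third stage, we tested these three variants in 520 controls without clinical signs of PD and applied Fisher's exact test. However, all three variants were found in controls, and the allele frequencies were similar between the PD group and the control group, so these three sites are more likely to be nonpathogenic rare polymorphisms.
The discovery of causative genes can provide important information on the pathogenesis of PD. Given the late onset, prevalence, and incomplete penetrance of PD, we reasoned that PD-causing variants would have low allele frequencies in population databases. We therefore used a relatively conservative 5% allele frequency threshold to screen candidate variants in the initial stage of filtering. In the later part of filtering, we used the Taiwan Biobank to remove variants with an allele frequency above 1% and applied segregation analysis in the family to reduce the number of candidates. When validating the pathogenicity of the candidate variants, we looked for them in additional families with hereditary PD to increase the likelihood that they are involved in PD pathogenesis. However, no significant association was found between the three variants identified by our pipeline and the risk of PD. We subsequently revised the filtering strategy with a more stringent allele frequency threshold (less than 1%) and carried out co-segregation analysis. Among the results, RRP8 c.1300C>T (P434S) had borderline prediction scores (SIFT 0.057, PolyPhen-2 0.895), but RRP8 has a protein functional interaction with SIRT1, a known anti-aging gene.
Our study design has several limitations. First, whole exome sequencing may miss important variants in non-coding regions or structural variants. Second, our filter removed synonymous variants, which can also be pathogenic. Third, in other unrelated families with hereditary PD we looked for the candidate variants themselves rather than sequencing all exons of the candidate genes. Fourth, variants without 100% penetrance may not be found by this filtering pipeline. The intermediate penetrance and multifactorial nature of PD make the discovery of causative genes difficult. In addition, PD mostly begins in old age: samples from the parental generation may be hard to obtain, and younger offspring may not yet have developed clinical signs of PD, which reduces the usefulness of segregation analysis.
Next-generation sequencing can be applied to discover rarer and previously unknown PD-causing genes. This study aims to establish a reliable filtering pipeline for PD-causing variants and to automate it. In the future it may become a powerful tool for genetic diagnosis, with the hope of gaining a deeper understanding of PD pathogenesis and providing new targets for early diagnosis, biomarker development, and drug development.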
Parkinson's disease (PD), which is caused by the death of dopaminergic neurons in the substantia nigra pars compacta, is one of the most common neurodegenerative disorders and affects 2–3% of the global population over the age of 65 years. The cardinal symptoms of PD mainly involve movement and include bradykinesia, resting tremor, rigidity, and postural instability. PD is thought to have a multifactorial etiology, resulting from complex interactions between aging, environmental exposures, and genetic susceptibilities. Approximately 10% of patients have a monogenic form of PD. Over the past two decades, the identification of mutations responsible for familial forms of PD has led to a better understanding of its pathogenetic mechanisms. Although numerous genetic studies have been conducted worldwide, the major genetic causes remain unclear for most PD patients. Because the disease burden of PD is increasing and there is no disease-modifying therapy to date, further studies exploring the genetic etiology and molecular pathways of PD are warranted. Whole exome sequencing (WES), which targets the protein-coding regions of the genome, is the preferred option for finding new causative genetic variants in rare Mendelian disorders. We aimed to identify a novel causative gene for familial PD by WES with next generation sequencing (NGS). Our study was done in three stages. First, we performed WES in a Taiwanese family with autosomal dominant PD. We established a filtering strategy, based on population database allele frequencies, segregation analysis, and bioinformatics information including functional analysis and prediction models, to narrow down the number of candidate disease-causal variants. Second, we examined the candidate disease-causal variants in multiple unrelated PD families via Sanger sequencing. Third, we sequenced the prioritized disease-causal variants in 520 controls with no signs of PD.
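The filtering strategy outlined above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the `Variant` record, the consequence labels, the sample identifiers, and the toy variants (`GENE_A` to `GENE_G`) are all hypothetical and serve only to show how an allele frequency cutoff, removal of synonymous and non-coding variants, and a co-segregation check might be combined.

```python
from dataclasses import dataclass

# Hypothetical consequence labels; a real annotation tool uses its own vocabulary.
CODING_OR_SPLICE = {"missense", "nonsense", "frameshift", "splice_site", "synonymous"}


@dataclass(frozen=True)
class Variant:
    gene: str             # placeholder gene name, not a real finding
    consequence: str      # annotated effect of the variant
    allele_freq: float    # allele frequency in a population database
    carriers: frozenset   # IDs of sequenced family members carrying the variant


def passes_annotation_filter(v, max_af=0.05):
    """Keep exonic/splice-site, non-synonymous variants at or below max_af."""
    if v.consequence not in CODING_OR_SPLICE:
        return False
    if v.consequence == "synonymous":
        return False
    return v.allele_freq <= max_af


def cosegregates(v, affected, unaffected):
    """True if every affected member carries the variant and no unaffected one does."""
    return affected <= v.carriers and not (v.carriers & unaffected)


def filter_variants(variants, affected, unaffected, max_af=0.05):
    """Apply the annotation filter, then the co-segregation check."""
    return [
        v for v in variants
        if passes_annotation_filter(v, max_af) and cosegregates(v, affected, unaffected)
    ]


# Toy family: three affected members and one unaffected member (made-up IDs).
AFFECTED = frozenset({"P1", "P2", "P3"})
UNAFFECTED = frozenset({"U1"})

# Toy variants, invented for illustration only.
TOY_VARIANTS = [
    Variant("GENE_A", "missense", 0.001, AFFECTED),
    Variant("GENE_B", "synonymous", 0.001, AFFECTED),
    Variant("GENE_C", "missense", 0.20, AFFECTED),
    Variant("GENE_D", "missense", 0.03, AFFECTED),
    Variant("GENE_E", "splice_site", 0.002, AFFECTED | UNAFFECTED),
    Variant("GENE_F", "intronic", 0.001, AFFECTED),
    Variant("GENE_G", "nonsense", 0.0005, frozenset({"P1", "P2"})),
]

if __name__ == "__main__":
    for cutoff in (0.05, 0.01):
        kept = filter_variants(TOY_VARIANTS, AFFECTED, UNAFFECTED, max_af=cutoff)
        print(cutoff, [v.gene for v in kept])
```

In this toy data, lowering `max_af` from 0.05 to 0.01 removes `GENE_D`, which mirrors the effect of moving from a conservative to a more stringent allele frequency filter.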
Using WES, we detected a cumulative total of over 1.5 million variants in three patients and one unaffected individual in the autosomal dominant PD family. After removing variants present in the unaffected family member, 29 variants satisfied our first-step filtering strategy. Of these, six co-segregated with PD in the family. After screening another 182 families with PD, three variants (C10orf120 c.T251C, KIF6 c.G1420A, and PNPLA7 c.C851T) were found in other probands with familial PD. However, all three variants were also found in neurologically normal controls, and the allele frequencies were similar between the PD group and the control group, which favors nonpathogenic rare polymorphisms. Elucidation of rare alleles with strong effect sizes can have important implications for understanding the pathogenetic mechanisms of PD. Given the late onset, prevalence, and incomplete penetrance of PD, we expected PD disease-causal variants to be observed at low frequencies in population databases. Therefore, we used a conservative 5% minor allele frequency filter in the discovery phase to retain candidate variants. Later, we used a more stringent 1% minor allele frequency filter based on the Taiwan Biobank, together with segregation analysis, to narrow down the variants. In the replication phase, we examined variants in additional unrelated familial probands to increase the likelihood that the candidate variants identified in the discovery phase are involved in the etiology of PD. However, we did not find any significant associations between the identified genetic variants and the risk of PD. Thus, we revised our filtering strategy with a more stringent allele frequency filter (less than 1%). After segregation analysis, RRP8 c.1300C>T (P434S) co-segregated with the PD phenotype in the family; its prediction scores were borderline, with a SIFT score of 0.057 (potentially intolerant) and a PolyPhen-2 score of 0.895 (probably damaging). RRP8 has a protein-protein interaction with SIRT1, a known anti-aging gene.
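The comparison of allele frequencies between the PD group and the control group was tested with Fisher's exact test. The following is a minimal, standard-library sketch of a two-sided Fisher's exact test on a 2×2 table of allele counts. The function name, the table layout, and the counts in the usage example are assumptions for illustration and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical):
        a = variant alleles in cases,    b = reference alleles in cases
        c = variant alleles in controls, d = reference alleles in controls
    Returns the sum of probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of seeing x variant alleles in the case row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


if __name__ == "__main__":
    # Placeholder counts, NOT study data: 3 of 200 case alleles vs 8 of 600 control alleles.
    print(fisher_exact_two_sided(3, 197, 8, 592))
```

A large p-value from such a test would be consistent with similar allele frequencies in the two groups, which is the pattern described above for the three candidate variants.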
Our study design has several limitations. First, exome sequencing may miss important non-coding variants or structural variations. Second, our filtering pipeline excludes synonymous variants, but they can also be deleterious. Third, in the replication phase, we looked for the identified variants rather than sequencing all exons of the identified genes in other unrelated familial PD probands. Fourth, our study design aimed to identify variants with fully penetrant effects responsible for strictly Mendelian PD. However, the intermediate penetrance of PD and possible genetic interactions with several risk factors make the discovery of PD-causal mutations challenging. Last but not least, the late onset of PD makes it harder to obtain multigenerational pedigrees with genetic samples and phenotype characterization for segregation analysis.
NGS technology can be applied to discover rare or unknown PD-causing mutations. This study establishes a filtering strategy for identifying disease-causal variants. NGS can be a powerful tool for the genetic diagnosis of PD, can lead to a better understanding of its pathogenesis, and can shed light on early diagnosis, biomarkers, and disease-modifying drug development.