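Transposons (also called jumping genes) are DNA sequences that can be copied or excised ("jump") from chromosomal DNA and then inserted at another site, thereby affecting the regulation of genes at the insertion site. The piggyBac transposon was isolated from the genome of the cabbage looper moth. The system has been widely used as a genome manipulation tool in a variety of mammalian cell lines, and has found broad application in areas such as functional genomics and induced pluripotent stem cells. Major features of the piggyBac system include high transposition efficiency across species, relatively low insertion site preference, and excision from the genome without leaving a footprint. The subject of this study, the NP-mPB transposase, is a nucleolus-targeted piggyBac transposase (PBase) constructed by adding a signal peptide from the HIV-1 TAT protein to a mammalian codon-optimized piggyBac transposase (mPB). Previous studies found that NP-mPB effectively increases transposition efficiency. NP-mPB was also built to direct transposon insertion toward the nucleolus organizer regions (NORs) by modifying mPB: because NORs contain many copies of rDNA, gene disruption caused by insertion in these regions is less likely to affect the normal function of other genes, which would support effective gene therapy. The aim of this study is to analyze next generation sequencing data to identify all NP-mPB and mPB insertion sites in the experiment, show their distribution across the mouse genome, and examine whether insertion is biased toward NORs or other genomic regions.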
PiggyBac is a popular transposon system used to deliver transgenes and to explore uncharacterized regions of the genome. PiggyBac transposase (PBase) has been widely applied as a genomic manipulation tool in various mammalian cell lines and model organisms. Major features of the piggyBac system include high transposition efficiency in different species, relatively low insertion site preference, and the ability to be removed seamlessly from the genome. These features make it potentially useful for functional genomics in a wide range of organisms, such as plants, cattle, pigs, mice, rats, flies, yeast, and several non-model insects. A novel nucleolus-predominant PBase, NP-mPB, was constructed by adding a nucleolus-predominant (NP) signal peptide from the HIV-1 TAT protein to a mammalian codon-optimized PBase (mPB). The initial goal was to create a modified mPB that would increase transposition efficiency and direct transposition towards the nucleolus organizer regions (NORs), which contain several tandem copies of ribosomal DNA genes. Gene disruption at NORs is believed to be less harmful to the organism. This research analyzes raw next generation sequencing (NGS) data from mouse ES cells transfected with mPB and NP-mPB. Insertion sites of the two PBases were identified by aligning the processed NGS data to the reference genome. Comparisons of the datasets reveal the transposition preferences of NP-mPB towards NORs and other genomic regions.
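The comparison step described above can be illustrated with a minimal Python sketch. It assumes that insertion sites have already been reduced to (chromosome, position) pairs after alignment, and that the regions of interest are given as half-open (chromosome, start, end) intervals. The function names, the chromosome labels `chrA` and `chrB`, and all coordinates are hypothetical and are not taken from the study; this is not the analysis pipeline actually used.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Group half-open (chrom, start, end) intervals by chromosome and merge overlaps."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_region(index, chrom, pos):
    """Return True if pos lies inside any merged interval on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


def fraction_in_regions(sites, index):
    """Fraction of (chrom, pos) insertion sites that fall inside the indexed regions."""
    if not sites:
        return 0.0
    hits = sum(in_region(index, chrom, pos) for chrom, pos in sites)
    return hits / len(sites)


# Toy data only: made-up intervals standing in for annotated regions such as NORs.
toy_regions = [("chrA", 100, 200), ("chrA", 150, 300), ("chrB", 0, 50)]
toy_index = build_index(toy_regions)

# Made-up insertion sites for the two transposases.
toy_mpb_sites = [("chrA", 120), ("chrC", 500), ("chrB", 60), ("chrA", 400)]
toy_np_mpb_sites = [("chrA", 299), ("chrB", 10), ("chrD", 5), ("chrA", 300)]

print("mPB:", fraction_in_regions(toy_mpb_sites, toy_index))
print("NP-mPB:", fraction_in_regions(toy_np_mpb_sites, toy_index))
```

Comparing the two fractions gives only a descriptive view of insertion preference. Deciding whether a difference is meaningful would additionally require a statistical test and a suitable background model, neither of which is sketched here.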