
Improving the Reliability of Differential Expression Analysis of Transcriptome Sequencing Data Using Multiple Sequence Segments

Segment-based quantification of differential expression in RNA-seq data

Advisor: 陳倩瑜

Abstract


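In recent years, RNA sequencing (RNA-seq) has become an important technique for measuring gene expression. However, current sequencing technologies suffer from several biases that cause the reads generated from a transcript to be non-uniformly distributed along it, so that some regions of the transcript carry more reads and others fewer. This uneven read distribution, a product of technical bias, can greatly reduce the accuracy of transcript quantification and, in turn, distort the results of differential gene expression analysis. This study therefore compares five methods for measuring differential gene expression, including methods that measure it over full-length transcripts and methods that measure it over transcript segments (segment-based). The results show that one of the segment-based methods, SRA, outperforms the conventional full-length approach, and that its quantification accuracy is considerably higher when transcript expression is very low. This study therefore recommends that RNA-seq users who wish to include lowly expressed genes in their analysis consider the segment-based quantification method (SRA) for measuring differential gene expression.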

Parallel Abstract


RNA sequencing (RNA-seq) is an essential tool for investigating transcript (gene) expression and has been widely recommended by recent studies. However, several potential biases cause reads to be sampled non-uniformly across the regions of a transcript. Such position biases can substantially reduce the accuracy with which quantification methods estimate transcript (gene) expression, and are therefore a critical issue in differential expression analysis of RNA-seq data. In this study, five quantification methods that produce transcript differential scores across two experimental conditions are presented and compared. The differential scores were constructed from the full-length transcripts and from the segments of each transcript, respectively. The results show that the segment-based method, SRA, reports transcript differential scores that are more consistent with the scores estimated from microarrays than the full-length approach does, especially when transcript (gene) expression is low. The analyses in this study suggest that RNA-seq users employ differential scores integrated from multiple segments when searching for differentially expressed genes, especially when transcript (gene) expression is considerably low.
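The contrast between a full-length score and a score integrated from multiple segments can be illustrated with a small sketch. The abstract does not state how SRA combines segments, so the median of per-segment log2 ratios used here is only an assumption chosen for illustration, not the thesis's method. The function names, the pseudocount, and the read counts are all hypothetical.

```python
import math
from statistics import median


def log2_ratio(count_a, count_b, pseudocount=1.0):
    """log2 fold change of condition B over condition A.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a segment has no reads.
    """
    return math.log2((count_b + pseudocount) / (count_a + pseudocount))


def full_length_score(segments_a, segments_b):
    """Differential score from read counts summed over the whole transcript."""
    return log2_ratio(sum(segments_a), sum(segments_b))


def segment_based_score(segments_a, segments_b):
    """Differential score integrated from segments.

    Here the integration is the median of per-segment log2 ratios, which is
    an illustrative choice and not necessarily what SRA does.
    """
    ratios = [log2_ratio(a, b) for a, b in zip(segments_a, segments_b)]
    return median(ratios)


# Hypothetical per-segment read counts for one lowly expressed transcript.
cond_a = [3, 3, 3, 3]
cond_b = [7, 7, 7, 39]  # the last segment is inflated in condition B only

full = full_length_score(cond_a, cond_b)
seg = segment_based_score(cond_a, cond_b)
print(f"full-length score:   {full:.2f}")
print(f"segment-based score: {seg:.2f}")
```

With these made-up counts, three of the four segments agree on a twofold change, so the segment-based score is 1.0, while the single inflated segment pulls the full-length score above 2. This is the kind of sensitivity to uneven read distribution that the abstract describes for low-expression transcripts.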

