
Training set determination for genomic selection (全基因組選拔訓練族群之決定)

Advisor: 廖振鐸

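Abstract


Given a candidate population that has been genotyped but not yet phenotyped, we propose an efficient algorithm for selecting an optimal subset of the candidates as the training set. The selected individuals are then phenotyped, and their genotype and phenotype data are used to build a genomic selection (GS) model. In this study we consider the whole-genome regression model and estimate the marker effects in the GS model by ridge regression. In breeding, the fitted GS model is then used to compute genomic estimated breeding values (GEBVs) for a test set that has genotype data only. We propose a new criterion for determining the required training set, derived from Pearson's correlation coefficient between the GEBVs and the true phenotypic values. We analyze a rice data set using R, and the results show that a training set chosen by the proposed algorithm gives the fitted model higher prediction accuracy than a randomly chosen training set.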

Parallel Abstract


For a candidate set of individuals that have been genotyped but not phenotyped, we develop a highly efficient algorithm to determine an optimal subset of the candidate set. The chosen subset serves as the training set to be phenotyped, and a genomic selection (GS) model is then built from the resulting phenotype and genotype data. In this study, we consider the whole-genome regression model and adopt ridge regression to estimate the marker effects in the GS model. The resulting GS model is then used to predict genomic estimated breeding values (GEBVs) for a test set of individuals that have been genotyped only. We propose a new optimality criterion for determining the training set, derived directly from Pearson's correlation between the GEBVs and the phenotypic values of the test set; Pearson's correlation is the standard measure of the prediction accuracy of a GS model. We implement our training set determination algorithm in R and illustrate it with a rice genome data set. The results show that the training set generated by our algorithm usually achieves significantly higher prediction accuracy than a randomly selected training set.
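
The pipeline described in the abstract (fit marker effects by ridge regression on a training set, predict GEBVs for a test set, score them by Pearson's correlation) can be illustrated with a minimal sketch. This is not the thesis's algorithm or its optimality criterion: the training set here is the random baseline, the data are simulated rather than the rice data set, the language is Python rather than R, and the helper names (`solve`, `ridge_fit`, `predict`, `pearson`) and the penalty `lam=1.0` are assumptions chosen only for illustration.

```python
import random
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_fit(X, y, lam):
    """Ridge regression on centered data; returns (intercept, marker effects)."""
    n, p = len(X), len(X[0])
    col_means = [mean(row[j] for row in X) for j in range(p)]
    y_mean = mean(y)
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc'Xc + lam * I) beta = Xc'yc
    A = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    beta = solve(A, b)
    intercept = y_mean - sum(col_means[j] * beta[j] for j in range(p))
    return intercept, beta

def predict(X, intercept, beta):
    """GEBV for each individual: intercept plus genotype times marker effects."""
    return [intercept + sum(x * e for x, e in zip(row, beta)) for row in X]

def pearson(u, v):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

# Simulated toy data (assumption): 60 individuals, 8 markers coded -1/0/1.
rng = random.Random(1)
n_ind, n_markers = 60, 8
true_effects = [rng.gauss(0, 1) for _ in range(n_markers)]
geno = [[rng.choice([-1, 0, 1]) for _ in range(n_markers)] for _ in range(n_ind)]
pheno = [sum(g * e for g, e in zip(row, true_effects)) + rng.gauss(0, 0.5)
         for row in geno]

candidate_idx = list(range(40))            # genotyped, not yet phenotyped
test_idx = list(range(40, 60))             # genotyped only
train_idx = rng.sample(candidate_idx, 20)  # random baseline training set

intercept, beta = ridge_fit([geno[i] for i in train_idx],
                            [pheno[i] for i in train_idx], lam=1.0)
gebv = predict([geno[i] for i in test_idx], intercept, beta)
accuracy = pearson(gebv, [pheno[i] for i in test_idx])
print("prediction accuracy (Pearson r):", round(accuracy, 3))
```

In a real study the test-set phenotypes are unknown when the training set is chosen, which is why a criterion derived from the correlation, rather than the correlation itself, is needed to rank candidate training sets. Replacing the `rng.sample` line with the output of such a criterion is the point where an optimized training set would enter this sketch.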

