Training Set Determination for Genomic Selection

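Given a candidate population that has been genotyped but not yet phenotyped, we propose an efficient algorithm for selecting an optimal subset of the candidates to serve as the training set. The selected individuals are then phenotyped, and their genotype and phenotype data are used to build a genomic selection (GS) model. In this study we consider the whole-genome regression model and estimate the marker effects in the GS model by ridge regression. In breeding, the fitted GS model is then used to compute genomic estimated breeding values (GEBVs) for a test set for which only genotype data are available. We propose a new criterion for determining the required training set, derived from Pearson's correlation coefficient between the GEBVs and the true phenotypic values. We analyze a rice data set using R, and the results show that a model fitted on a training set chosen by the proposed algorithm attains higher prediction accuracy than one fitted on a randomly selected training set.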

Keywords

genomic estimated breeding value; genomic prediction; plant breeding; prediction accuracy; single nucleotide polymorphism (SNP) marker

English Abstract

For a given candidate set of individuals that have been genotyped but not phenotyped, we develop a highly efficient algorithm to determine an optimal subset of the candidate set. The chosen subset serves as a training set to be phenotyped, and a genomic selection (GS) model is then built from the resulting phenotype and genotype data. In this study, we consider the whole-genome regression model and adopt ridge regression to estimate the marker effects in the GS model. The resulting GS model is then employed to predict genomic estimated breeding values (GEBVs) for a given test set of individuals that have only been genotyped. We propose a new optimality criterion to determine the required training set, derived directly from Pearson's correlation between the GEBVs and the phenotypic values of the test set. Pearson's correlation is the standard measure of the prediction accuracy of a GS model. We implement our training set determination algorithm in R and illustrate it with a rice genome data set. The results show that a training set generated by our algorithm usually achieves significantly higher prediction accuracy than a randomly selected training set.
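The evaluation pipeline described above (ridge regression for marker effects, GEBVs for a test set, Pearson's correlation as prediction accuracy) can be sketched as follows. This is only an illustrative sketch in Python, not the R implementation from this study. The simulated genotypes, the marker coding of -1/0/1, the penalty `lam = 1.0`, and all function names are assumptions made for the example. The training set here is simply the first 40 individuals. The proposed selection criterion itself is not reproduced.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_marker_effects(X, y, lam):
    """Ridge estimate of marker effects, using the dual form
    beta = X' (X X' + lam I)^{-1} (y - mean(y)), convenient when markers
    outnumber individuals."""
    n, p = len(X), len(X[0])
    mu = sum(y) / n
    K = [[sum(a * b for a, b in zip(X[i], X[j])) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, [yi - mu for yi in y])
    beta = [sum(alpha[i] * X[i][k] for i in range(n)) for k in range(p)]
    return mu, beta


def gebv(X, mu, beta):
    """Predicted values for genotyped-only individuals: mean plus marker score."""
    return [mu + sum(x * b for x, b in zip(row, beta)) for row in X]


def pearson(u, v):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu_u) ** 2 for a in u))
    sv = math.sqrt(sum((b - mu_v) ** 2 for b in v))
    return cov / (su * sv)


# Simulated data (hypothetical): 60 individuals, 50 markers coded -1/0/1.
random.seed(0)
n_all, p = 60, 50
X_all = [[random.choice([-1, 0, 1]) for _ in range(p)] for _ in range(n_all)]
true_effects = [random.gauss(0, 1) for _ in range(p)]
y_all = [sum(x * b for x, b in zip(row, true_effects)) + random.gauss(0, 1)
         for row in X_all]

# Arbitrary split: first 40 individuals as training set, last 20 as test set.
X_train, y_train = X_all[:40], y_all[:40]
X_test, y_test = X_all[40:], y_all[40:]

lam = 1.0  # assumed ridge penalty, not a value taken from the study
mu_hat, beta_hat = ridge_marker_effects(X_train, y_train, lam)
gebv_test = gebv(X_test, mu_hat, beta_hat)
r = pearson(gebv_test, y_test)  # prediction accuracy on the test set
print(f"Pearson correlation between GEBVs and phenotypes: {r:.3f}")
```

Swapping the arbitrary split for different candidate subsets and comparing the resulting `r` values is one way to see how the choice of training set affects prediction accuracy.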

English Keywords

GEBV; genomic prediction; plant breeding; prediction accuracy; SNP marker
