Genomic selection (GS) is a powerful and efficient tool for identifying promising hybrids in a hybrid breeding program. In this study, we build a GS model from a training population with known genotypes and phenotypes, and then use the fitted model to predict genomic estimated breeding values (GEBVs) for all hybrid combinations of interest. The GS model includes both additive and dominance marker effects. To investigate how the method used to estimate marker effects influences the GEBVs, we evaluate through simulation studies three shrinkage methods (ridge regression, LASSO, and elastic net), three Bayesian methods (Bayes A, Bayes B, and Bayes C), and a linear mixed effects model (LMM). The phenotypes are quantitative traits measured either on a continuous scale or as ordinal scores. The results show that most of the estimation methods yield robust GEBVs, the exceptions being LASSO and elastic net. Using leave-one-out cross-validation, we then select from the robust methods the most appropriate one for each trait in our pumpkin data. Finally, we use the resulting GS models to predict hybrid performance for intraspecific crosses within each of two pumpkin groups, C. maxima and C. moschata, providing plant breeders with useful information for identifying promising hybrids and superior parental lines.
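As a rough illustration of the workflow summarized above, the following Python sketch builds an additive-plus-dominance marker design, estimates marker effects by ridge regression, and scores the fit with leave-one-out cross-validation. It is not the implementation used in the study. The 0/1/2 genotype coding, the assumption of fully homozygous parental lines, the penalty value, the simulated data, and all function names are assumptions made for illustration only.

```python
import itertools
import numpy as np

def additive_dominance_design(genotypes):
    """Build the design matrix [A | D] from genotypes coded 0/1/2 (assumed coding).

    A = g - 1 gives additive codes -1, 0, 1; D is 1 for heterozygotes, else 0.
    """
    g = np.asarray(genotypes, dtype=float)
    additive = g - 1.0
    dominance = (g == 1).astype(float)
    return np.hstack([additive, dominance])

def hybrid_genotypes(parents):
    """Genotypes of all pairwise crosses, assuming homozygous parents coded 0 or 2."""
    parents = np.asarray(parents, dtype=float)
    pairs = list(itertools.combinations(range(len(parents)), 2))
    hybrids = np.array([(parents[i] + parents[j]) / 2.0 for i, j in pairs])
    return pairs, hybrids

def ridge_fit(X, y, lam):
    """Ridge estimates of marker effects with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

def ridge_predict(X, intercept, beta):
    """Predicted genotypic values (GEBVs in this sketch) for the rows of X."""
    return intercept + X @ beta

def loocv_predictions(X, y, lam):
    """Leave-one-out predictions: refit without line i, then predict line i."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        intercept, beta = ridge_fit(X[keep], y[keep], lam)
        preds[i] = ridge_predict(X[i:i + 1], intercept, beta)[0]
    return preds

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated data (hypothetical): 12 inbred parents, 30 markers.
    parents = 2 * rng.integers(0, 2, size=(12, 30))
    pairs, hybrids = hybrid_genotypes(parents)
    X = additive_dominance_design(hybrids)
    true_effects = rng.normal(scale=0.5, size=X.shape[1])
    y = 10.0 + X @ true_effects + rng.normal(size=len(X))

    train = rng.choice(len(X), size=40, replace=False)  # phenotyped hybrids
    lam = 1.0  # arbitrary penalty; in practice it would be tuned
    cv_pred = loocv_predictions(X[train], y[train], lam)
    print("LOOCV correlation:", np.corrcoef(y[train], cv_pred)[0, 1])

    # Predict every hybrid combination and list the top five crosses.
    intercept, beta = ridge_fit(X[train], y[train], lam)
    gebv = ridge_predict(X, intercept, beta)
    for k in np.argsort(gebv)[::-1][:5]:
        print("cross", pairs[k], "predicted value", round(float(gebv[k]), 2))
```

Ridge regression is used here only because it has a closed-form solution that keeps the sketch short; the other estimation methods named in the abstract would replace `ridge_fit` while leaving the design matrix and the cross-validation loop unchanged.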