The Trade-off Between the Bias and Variance of the k-Fold Cross-Validation Test Mean Squared Error: A Simulation Study

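This paper studies the theoretical trade-off between the bias and the variance of the test mean squared error under k-fold cross-validation. The theory states that as the number of folds increases, the variance of the test mean squared error grows while its bias shrinks. We simulate a linear regression model and use cross-validation to compute the test mean squared error. The simulation results show that the larger the sample size, the more closely the theoretical bias-variance trade-off agrees with practical experience. That is, as the number of folds increases, the bias of the test mean squared error becomes smaller and its variance becomes larger.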

Keywords

Cross-validation; test mean squared error

Parallel Abstract

This paper examines the theoretical trade-off between the bias and the variance of the test mean squared error under k-fold cross-validation. According to the theory, as the number of folds increases, the variance of the test mean squared error becomes larger and its bias becomes smaller. We simulate a linear regression model and use cross-validation to calculate the test mean squared error. The computer simulations show that the larger the number of data samples, the more consistent the theoretical bias-variance trade-off is with practical experience. That is, as the number of folds increases, the bias of the test mean squared error becomes smaller and its variance becomes larger.

Keywords: cross-validation, test mean squared error
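The kind of simulation described above can be sketched as follows. This is a minimal illustration and not the authors' code: the data-generating model (one standard normal predictor, intercept 1, slope 2, unit noise variance), the number of replications, the size of the fresh test set, and all function names are assumptions made for the example. For each k, the sketch repeats the simulation many times, measures bias as the average k-fold estimate minus the average test error of the model fitted on the full sample, and measures variance as the spread of the k-fold estimates across replications.

```python
import numpy as np


def simulate_data(n, rng, beta0=1.0, beta1=2.0, sigma=1.0):
    """Draw n points from y = beta0 + beta1 * x + noise (assumed model)."""
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(scale=sigma, size=n)
    return x, y


def fit_ols(x, y):
    """Ordinary least squares with an intercept; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, x):
    return coef[0] + coef[1] * x


def kfold_cv_mse(x, y, k, rng):
    """k-fold cross-validation estimate of the test mean squared error."""
    n = len(x)
    folds = np.array_split(rng.permutation(n), k)
    fold_mse = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False  # hold this fold out of the fit
        coef = fit_ols(x[train_mask], y[train_mask])
        resid = y[test_idx] - predict(coef, x[test_idx])
        fold_mse.append(np.mean(resid ** 2))
    return float(np.mean(fold_mse))


def cv_bias_variance(n, k, n_reps=200, n_test=5000, seed=0):
    """Monte Carlo bias and variance of the k-fold estimate at sample size n."""
    rng = np.random.default_rng(seed)
    cv_estimates = np.empty(n_reps)
    true_errors = np.empty(n_reps)
    for r in range(n_reps):
        x, y = simulate_data(n, rng)
        cv_estimates[r] = kfold_cv_mse(x, y, k, rng)
        # Reference value: error of the full-sample fit on fresh data.
        coef = fit_ols(x, y)
        x_new, y_new = simulate_data(n_test, rng)
        true_errors[r] = np.mean((y_new - predict(coef, x_new)) ** 2)
    bias = float(cv_estimates.mean() - true_errors.mean())
    variance = float(cv_estimates.var(ddof=1))
    return bias, variance


if __name__ == "__main__":
    for k in (2, 5, 10, 50):  # k = 50 is leave-one-out when n = 50
        bias, variance = cv_bias_variance(n=50, k=k)
        print(f"k={k:2d}  bias={bias:+.4f}  variance={variance:.4f}")
```

Varying `n` in the final loop gives a rough way to see how the pattern across k changes with sample size. The numbers depend on the assumed model and the seed, so they should not be read as a reproduction of the reported results.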