
Using Simulated Near Infrared Spectral Data with Various Degrees of Variation to Evaluate the Stability of Different Variable Selection Methods

Abstract


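In recent years, analyzing the chemical composition of materials with near infrared (NIR) spectroscopy has proven fast and convenient, with the two added advantages of being non-destructive and highly selective. However, the number of wavelength variables measured in NIR spectral data is very large, and the wavelength variables are often highly correlated with one another, so partial least squares regression (PLSR) is commonly used to address these problems. PLSR analysis, however, uses all predictor variables, and a calibration model built this way is easily disturbed by noise. If a variable selection method can be properly applied to remove some of the noise, the fitting performance and predictive ability of the model can be improved. This study applies four commonly used variable selection methods, namely the iterative predictor weighting method (IPW), the spectral variance method (SV), the signal-to-noise ratio method (SNR), and genetic algorithms (GA), to simulated data with different degrees of variation, and analyzes the number of wavelength variables in the best model and the stability of its fitting performance and predictive ability. The results show that, in 1,000 simulated data analyses at different degrees of variation, GA gave the best performance on every statistical measure and did not alternate between good and poor results as the degree of variation changed. Therefore, selecting wavelength variables with GA before performing PLSR on NIR spectral data can yield a calibration model whose number of wavelength variables, fitting performance, and predictive ability are all stable.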

Parallel Abstract


Methods that use near infrared (NIR) spectral data to determine chemical composition quantitatively have become common in recent years because they are fast, convenient, non-destructive, and highly selective. In NIR spectral data, the number of variables often exceeds the sample size, and the spectral variables are highly correlated. To deal with these problems, partial least squares regression (PLSR) is frequently used to establish a calibration model from the full-spectrum variables. However, the fitting performance and predictive ability of a calibration model built on full-spectrum variables are prone to noise effects. Therefore, four variable selection methods, namely the iterative predictor weighting method (IPW), the spectral variance method (SV), the signal-to-noise ratio method (SNR), and genetic algorithms (GA), were evaluated on simulated data with various degrees of variability. The evaluation covered the number of spectral variables selected, calibration performance, and predictive ability. Across 1,000 simulation runs, GA performed best among the four methods. The results suggest that, when analyzing NIR spectral data, using GA to select spectral variables may not only improve the performance of the PLSR model but also yield a stable one.
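The kind of experiment the abstract describes, repeating a variable selection step on simulated data at several noise levels and checking how consistently it behaves, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the selector is a plain correlation filter (it is none of IPW, SV, SNR, or GA), the simulated "wavelengths" are independent Gaussian values rather than correlated NIR spectra, and the names `simulate`, `select_top_k`, and `mean_hit_rate` are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(rng, n_samples, n_wavelengths, informative, noise_sd):
    """Simulate toy spectra and a response driven by a few informative columns.

    Assumption: columns are independent standard normal values, which is far
    simpler than real, highly correlated NIR spectra.
    """
    spectra = [
        [rng.gauss(0.0, 1.0) for _ in range(n_wavelengths)]
        for _ in range(n_samples)
    ]
    response = [
        sum(row[j] for j in informative) + rng.gauss(0.0, noise_sd)
        for row in spectra
    ]
    return spectra, response


def select_top_k(spectra, response, k):
    """Keep the k columns with the largest absolute correlation to the response."""
    scores = []
    for j in range(len(spectra[0])):
        column = [row[j] for row in spectra]
        scores.append((abs(pearson(column, response)), j))
    scores.sort(reverse=True)
    return {j for _, j in scores[:k]}


def mean_hit_rate(noise_sd, n_runs=200, seed=0):
    """Average fraction of truly informative columns recovered over n_runs."""
    rng = random.Random(seed)
    informative = (3, 7, 12)  # hypothetical informative wavelength indices
    total = 0.0
    for _ in range(n_runs):
        spectra, response = simulate(rng, 50, 20, informative, noise_sd)
        chosen = select_top_k(spectra, response, len(informative))
        total += len(chosen & set(informative)) / len(informative)
    return total / n_runs


if __name__ == "__main__":
    # Compare how reliably the filter recovers the informative columns
    # as the noise level (degree of variation) increases.
    for noise_sd in (0.1, 1.0, 10.0):
        print(noise_sd, round(mean_hit_rate(noise_sd), 3))
```

In this toy setting the hit rate should fall as the noise level rises, which is the sort of instability the study measures for each selection method. A fuller comparison would fit a PLSR model on the selected variables and track calibration and prediction statistics across runs.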
