
氣象資料利用泛函主成分分析在農業上的應用

Analysis of Meteorology data using Functional Principal Component Analysis in Agriculture

Advisor: 蔡政安

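Abstract


Data such as crop yield or harvest period directly or indirectly affect the market value of agricultural products, so accurate predictions of yield or harvest period can help governments, merchants, and farmers allocate resources more efficiently. Weather data are the most readily available data that influence agricultural production. However, weather data spanning an entire growing season are usually very high-dimensional, so ordinary variable selection methods cannot pick out a stable set of predictors with significant effects. In addition, weather records are often clearly correlated with one another, which makes the estimated regression coefficients unstable because of collinearity. To overcome these problems, this study treats weather data as functional data and uses functional principal component analysis (FPCA) to convert the weather functions into principal component scores and reduce their dimension. The aim is to obtain low-dimensional, mutually uncorrelated data that retain most of the information in the original data. A varying-coefficient model is then used to build the regression between the principal component scores and the response variable, and group regularization methods are used to select which principal components enter the model. The group regularization methods used in this study are Group Lasso, Group Bridge, Group SCAD, and Group MCP. The simulation study shows that when the simulated eigenvalues form a geometric sequence, the Group Bridge method predicts best, whereas when the simulated eigenvalues are all equal, the Lasso and Ridge models predict better. In the real-data study, we analyze tea trial survey data provided by the Tea Research and Extension Station, Council of Agriculture, Executive Yuan, public rice yield data provided by the Agriculture and Food Agency, Council of Agriculture, Executive Yuan, and weather data provided by the Central Weather Bureau. In the tea data analysis, because the sample size is small, the group regularization methods predicted poorly for one of the years. In the rice data analysis, the Group Bridge model predicted best.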
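The FPCA step described in the abstract can be illustrated with a minimal sketch. It assumes curves observed on a common, equally spaced grid and uses a plain eigendecomposition of the discretized sample covariance. The function name `fpca_scores`, the simulated sine basis, and all numeric settings are hypothetical choices for illustration, not the thesis's implementation or data.

```python
import numpy as np

def fpca_scores(curves, grid, n_components):
    """Discretized FPCA on a common, equally spaced grid (illustrative sketch)."""
    dt = grid[1] - grid[0]                       # grid spacing, assumed constant
    centered = curves - curves.mean(axis=0)      # subtract the mean curve
    cov = centered.T @ centered / (curves.shape[0] - 1)
    # The matrix cov * dt approximates the covariance operator on the grid.
    eigvals, eigvecs = np.linalg.eigh(cov * dt)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    eigfuns = eigvecs[:, order] / np.sqrt(dt)    # rescale so sum(phi**2) * dt == 1
    scores = centered @ eigfuns * dt             # score = integral of (x - mean) * phi
    return scores, eigvals, eigfuns

# Simulated "weather curves" whose true eigenvalues form a geometric sequence
# (assumed values, chosen only to mirror the scenario named in the abstract).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 120)
true_eigvals = 4.0 * 0.5 ** np.arange(4)
basis = np.array([np.sqrt(2) * np.sin((k + 1) * np.pi * grid) for k in range(4)])
true_scores = rng.normal(size=(60, 4)) * np.sqrt(true_eigvals)
curves = 20.0 + true_scores @ basis + rng.normal(scale=0.1, size=(60, 120))

scores, eigvals, eigfuns = fpca_scores(curves, grid, n_components=3)
score_cov = np.cov(scores, rowvar=False)         # diagonal up to rounding error
print(np.round(score_cov, 3))
```

The printed covariance matrix of the scores is diagonal up to rounding, with the estimated eigenvalues on the diagonal. This is the property the abstract relies on: 120 correlated grid values per curve are replaced by a few uncorrelated scores.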

Parallel Abstract


Agricultural information such as yield and harvest period directly or indirectly affects the price of agricultural products in the market, so accurately predicting the yield or harvest period can help governments, merchants, and farmers allocate resources more efficiently. Weather data are the most readily available data that influence agricultural production. However, the dimension of weather data across the entire growing season is usually very large, which makes it difficult for ordinary variable selection methods to select a stable set of variables with significant effects. In addition, weather observations are often strongly correlated, so the estimates of the regression coefficients become unstable because of collinearity. To overcome these problems, we treat weather data as functional data and use functional principal component analysis to convert the weather functions into uncorrelated, low-dimensional principal component scores that retain most of the information in the original data. We use a varying-coefficient model to capture the relationship between the response variable and the principal component scores, and we use group regularization methods to select the components, namely Group Lasso, Group Bridge, Group SCAD, and Group MCP. Our simulation study shows that the Group Bridge method outperforms the other methods in prediction when the simulated eigenvalues form a geometric series, whereas the Lasso and Ridge models have better predictive power when the simulated eigenvalues are all equal. In the real data analysis, we use the tea trial survey data provided by the Tea Research and Extension Station, Council of Agriculture, Executive Yuan, the open data on rice production provided by the Agriculture and Food Agency, Council of Agriculture, Executive Yuan, and weather data provided by the Central Weather Bureau. In the tea data analysis, because of the small sample size, the group regularization methods predicted poorly for one of the years. In the rice data analysis, the Group Bridge model had the best prediction accuracy.
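Group regularization selects or drops whole blocks of coefficients together, for example all coefficients attached to one principal component. The sketch below shows this for the Group Lasso penalty only, fitted by proximal gradient descent on a least-squares loss. It is an illustration under assumptions: the function names, the simulated design, and the penalty level `lam` are hypothetical, and the Group Bridge, Group SCAD, and Group MCP penalties used in the study are not shown.

```python
import numpy as np

def group_soft_threshold(beta, groups, t):
    """Proximal operator of t * sum_g sqrt(p_g) * ||beta_g||_2."""
    out = np.zeros_like(beta, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(beta[idx])
        thr = t * np.sqrt(len(idx))              # larger groups are penalized more
        if norm > thr:
            out[idx] = (1.0 - thr / norm) * beta[idx]
        # otherwise the whole group stays exactly zero
    return out

def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimize (1/2n)||y - X b||^2 + lam * sum_g sqrt(p_g)||b_g||_2."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2         # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = group_soft_threshold(beta - step * grad, groups, step * lam)
    return beta

# Simulated data: three groups of two coefficients, only the first group is active.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.1, size=400)

beta_hat = group_lasso(X, y, groups, lam=0.3)
print(np.round(beta_hat, 3))
```

In this simulated setting the inactive groups are set to zero as blocks, while the active group is kept but shrunk toward zero, which is the usual trade-off of a convex group penalty.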

