本論文的目的在應用迴歸模式來預測每個性別與年齡的每月泌尿碎石人次。同時我們也想要檢測在季節性的變動上結合泌尿結石疾病與氣候條件之間的影響。我們回顧性地分析來自2003到2008年台灣西北部的區域醫院的泌尿結石疾病患者之數據資料,這6年期間一共有7660例被診斷出為泌尿結石疾病。然而,我們也想了解碎石對於不同的年齡層與性別的影響力。 目前,大多數關於泌尿結石結合氣候條件的國際文獻研究中,幾乎很少文獻考慮到迴歸模式的共線性問題。因為不同的氣候因素可能會有高度的相關性,所以我們利用變異數膨脹因子來診斷不同的氣候因素是否有高度的相關性或共線性。而診斷出來的結果,氣溫與氣壓有極高度的相關。 迴歸模式常常被應用在預測、敘述和控制上。但是基本假設為觀察值必須互相獨立,然而本研究資料之間卻有相關性。因此先使用廻歸模式來建立統計預測模式。然而,在本研究中,我們是考慮5個氣候參數為自變數,分別是周遭溫度,日照時數,相對濕度,氣壓和降雨量。同時,我們也希望去檢測季節性的變動是否對每個月泌尿結石人數有影響,因此必須加上月分的虛擬變數(1至11月,第12月為基準變數)及時間趨勢。由於資料間具有相關性,假設廻歸模式的誤差項服從(Autoregression-Intergrated Moving average)ARIMA模式。模式驗證的方式是利用2003至2007年的資料來建立迴歸模式,並使用2008年的去驗證及比較2008年每個月泌尿結石的預測值。 迴歸模式的應變數是每月碎石人次,依據年齡層及性別來分類成數個群體。分類方式有三種: 第一個子群體的分類法是基於5個年齡層群而設計。他們分別是(20至29歲,30至39歲,40至49歲,50至59及60歲或以上)。為了研究每個年齡群體中泌尿結石與性別之間的關係,這些年齡群體再藉著男性和女性再來劃分10個年齡子群體,總共是10子群體。第二個子群體的分類法是接續第一個子群體的分類法,根據60歲以上的年齡層來劃分為2個年齡層群, 再藉著男性和女性來劃分2個年齡子群體,總共是4子群體。他們分別是60到70歲及70歲以上,這2個年齡群再根據男性及女性劃分為4個,所以總共為12個子群體。第三個子群體的分類法是藉著參考其他研究來設計3個年齡群體,他們分別為18至44歲,45至64歲及65歲以上。總共為6個子群體。 本研究對於不同年齡層及性別所建立的迴歸預測模式。然而,我們主要是對於這3種年齡層分類方式來建立合適的模式及模式的比較,其結果之ㄧ: 這3種分類方式中, 以第三種分類方式為最優於其他的分類方式,因為其他兩種分類方式的年齡區間較窄而導致人數逐漸變少,所以才會造成模式的建立有些許的困難。其結果之二:應用在2008年時,結果顯示出,不僅是周遭溫度連相對濕度和大氣壓力也明顯的影響泌尿結石的人數。並且,在季節性的變動上高峰期集中在7月、8月和9月。對於不同的性別與年齡,非常明顯的男性多於女性及30至50歲年齡層是主要受影響的族群。
The purpose of this thesis is to apply a regression model to predict the monthly number of urinary lithotripsy (UL) for each gender and age group. We also examine how climate conditions and seasonal variation are associated with urinary stone disease. We retrospectively analyzed data from 2003 to 2008 on patients with urinary stone disease at a regional hospital in northwestern Taiwan; a total of 7660 patients were diagnosed during this 6-year period. In addition, we want to understand how UL differs across age groups and genders.
At present, very few international studies relating the number of UL to climate conditions consider the multicollinearity problem of the regression model. Because different climate factors may be highly correlated, we used the Variance Inflation Factor (VIF) to diagnose whether the climate factors were highly correlated or collinear. The diagnosis showed that ambient temperature and atmospheric pressure were very highly correlated.
Regression models are widely used for prediction, description and control. Their basic assumption is that the observations are mutually independent, but the data in this study are correlated. We therefore first use a regression model to establish the statistical prediction model. We considered 5 climate parameters as independent variables: ambient temperature, hours of sunshine, relative humidity, atmospheric pressure and amount of rainfall. We also wished to examine whether seasonal variation influences the monthly number of UL, so the model includes monthly dummy variables (months 1 to 11, with month 12 as the baseline) and a time trend.
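The VIF diagnosis mentioned above can be sketched as follows. This is a minimal illustration, not the thesis's own computation: it uses the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse correlation matrix, which is 1 / (1 - R_j^2). The function names are hypothetical, and the monthly climate values are made up for illustration; they are not the study's data.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(x * x for x in c)) for c in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """Return one VIF per predictor: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Made-up monthly values (illustration only, not the study's data).
temperature = [16, 17, 19, 23, 26, 28, 29, 29, 27, 24, 21, 17]
pressure = [1020, 1019, 1017, 1013, 1010, 1007, 1006, 1006, 1009, 1013, 1016, 1019]
humidity = [78, 80, 79, 77, 76, 75, 74, 76, 75, 74, 75, 76]

for name, value in zip(["temperature", "pressure", "humidity"],
                       vif([temperature, pressure, humidity])):
    print(f"{name}: VIF = {value:.1f}")
```

A VIF close to 1 indicates a predictor that is nearly uncorrelated with the others. Large values flag the kind of collinearity reported here for temperature and pressure, although the cut-off used to call a VIF "large" is a convention and is not stated in the abstract.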
Because the data are correlated, we assume that the error term of the regression model follows an Autoregressive Integrated Moving Average (ARIMA) model. To verify the model, we used the data from 2003 to 2007 to establish the regression model and the data from 2008 to validate it, against which the predicted monthly numbers of UL for 2008 were compared.
The dependent variable of the regression model was the monthly number of UL, classified into groups by age group and gender. Three classification schemes were used. The first scheme was based on 5 age groups: 20 to 29, 30 to 39, 40 to 49, 50 to 59, and 60 years or older. To study the relation between urinary stone disease and gender within each age group, these age groups were further divided by male and female, giving 10 subgroups in total. The second scheme extended the first by splitting the group aged 60 or older into 2 age groups, 60 to 70 and over 70 years old, which were divided by male and female into 4 subgroups; together with the remaining subgroups of the first scheme, this gave 12 subgroups in total. The third scheme used 3 age groups taken from other studies: 18 to 44, 45 to 64, and 65 years or older, giving 6 subgroups in total.
In this research, we established regression prediction models for the different age groups and genders. We mainly fitted an appropriate model under each of the three classification schemes and compared the resulting models.
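The three classification schemes described above can be sketched as a simple lookup that assigns each patient record to an age-and-gender subgroup and tallies monthly counts. This is only an illustration under stated assumptions: ages are treated as whole years, the "60 to 70" band is taken to include age 70 so that "over 70" starts at 71, and the names `SCHEMES`, `subgroup`, `monthly_counts` and the record layout are hypothetical rather than taken from the thesis.

```python
from collections import Counter

# Age bands as (lowest age, highest age); None means no upper limit.
# Assumption: whole-year ages, with "60 to 70" including 70 and "over 70" starting at 71.
SCHEMES = {
    "first": [(20, 29), (30, 39), (40, 49), (50, 59), (60, None)],
    "second": [(20, 29), (30, 39), (40, 49), (50, 59), (60, 70), (71, None)],
    "third": [(18, 44), (45, 64), (65, None)],
}


def subgroup(age, sex, scheme):
    """Return (age band label, sex) for a patient, or None if the age is outside the scheme."""
    for low, high in SCHEMES[scheme]:
        if age >= low and (high is None or age <= high):
            label = f"{low}+" if high is None else f"{low}-{high}"
            return (label, sex)
    return None


def monthly_counts(records, scheme):
    """Count records per (year, month, age band, sex).

    Each record is a hypothetical tuple (year, month, age, sex).
    """
    counts = Counter()
    for year, month, age, sex in records:
        group = subgroup(age, sex, scheme)
        if group is not None:
            counts[(year, month) + group] += 1
    return counts


# Made-up records for illustration only.
records = [(2003, 7, 35, "M"), (2003, 7, 38, "M"), (2003, 8, 66, "F")]
for scheme in SCHEMES:
    print(scheme, 2 * len(SCHEMES[scheme]), "subgroups", dict(monthly_counts(records, scheme)))
```

Each scheme yields one monthly series per subgroup, so the narrower bands of the first and second schemes spread the same patients over more series than the third.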
The first result: among the three classification schemes, the third was superior to the others, because the narrower age intervals of the first and second schemes left progressively fewer monthly UL in each subgroup, which made it somewhat difficult to establish an appropriate model. The second result: when the prediction model was applied to 2008, not only ambient temperature but also relative humidity and atmospheric pressure had a clear influence on the number of UL. Moreover, the seasonal peak was concentrated in July, August and September. With respect to gender and age, males were clearly more numerous than females, and the 30 to 50 age group was the main affected group.