
A Study of Imputing Missing Data for Household Income (家庭收入遺失值之插補研究)

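Abstract


"Average monthly household income" is an important explanatory or observed variable in many studies, but it is highly prone to item nonresponse. This study uses the 2006 "Survey of Internet Broadband Usage in Taiwan" dataset as empirical data and imputes missing values of average monthly household income with the hot deck method, mode imputation, probability distribution imputation, multinomial logistic regression, and an integrated imputation method. The 2005 "Survey of Internet Broadband Usage in Taiwan" dataset is then used to verify the consistency of the evaluation results. The results show that, for average monthly household income, personal education level and urban/rural residence are the better auxiliary variables for imputation. Overall, multinomial logistic regression with personal education level and urban/rural residence as explanatory variables is the most suitable imputation model; however, if the structure of the data must remain unchanged after imputation, the hot deck method stratified by personal education level and urban/rural residence is the most suitable model.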

Parallel Abstract


'Household income' is a key factor in many social issues. Because of privacy concerns, many respondents are unwilling to report their household income, which leads to item nonresponse. This study seeks a suitable imputation model for household income. Using data from the '2006 Survey of Internet Broadband Usage in Taiwan', we compared the imputation performance of the Hot Deck method, the Mode Imputation method, Multinomial Logistic Regression, and the Multiple Imputation method. 'Personal education degree' and 'town of residence' were found to be the best auxiliary variables for imputing missing household income. Overall, Multinomial Logistic Regression with 'personal education degree' and 'town of residence' as explanatory variables gave the best imputation performance. If preserving the structure of the data after imputation is the priority, however, the Hot Deck method using 'personal education degree' and 'town of residence' as auxiliary variables is the better imputation model. To check the generality of these conclusions, data from the '2005 Survey of Internet Broadband Usage in Taiwan' were also analyzed, and the results were consistent.
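The stratified hot deck idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`education`, `area`, `income`), the income brackets, and the toy records are all hypothetical, and the donor is drawn at random within each stratum, which is only one of several hot deck variants.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, strata_keys, seed=0):
    """Fill missing `target` values by copying a random donor value
    from a respondent in the same stratum (hypothetical helper)."""
    rng = random.Random(seed)

    # Collect observed target values (donors) for each stratum.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[k] for k in strata_keys)].append(rec[target])

    imputed = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records are not modified
        if new_rec[target] is None:
            pool = donors.get(tuple(rec[k] for k in strata_keys))
            if pool:  # leave the value missing when the stratum has no donor
                new_rec[target] = rng.choice(pool)
        imputed.append(new_rec)
    return imputed


# Hypothetical toy data: income is a categorical bracket, None marks nonresponse.
survey = [
    {"education": "college", "area": "urban", "income": "60k-80k"},
    {"education": "college", "area": "urban", "income": None},
    {"education": "primary", "area": "rural", "income": "20k-40k"},
    {"education": "primary", "area": "rural", "income": None},
    {"education": "graduate", "area": "urban", "income": None},
]

completed = hot_deck_impute(survey, "income", ["education", "area"])
for row in completed:
    print(row)
```

Because each imputed value is copied from an observed respondent in the same stratum, the within-stratum distribution of income brackets tends to be preserved, which matches the abstract's remark about keeping the data structure unchanged. A stratum with no donors is left missing here; a real analysis would need a fallback rule.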


Cited by


陳淑君 (2010). Prevalence of geriatric syndromes and related factors [Master's thesis, Chang Jung Christian University]. Airiti Library. https://doi.org/10.6833/CJCU.2010.00067
