
Investigation of Variation and Calibration in Experts' Distribution Judgments


Abstract


Large engineering project management and complex business decisions often rely on expert input to obtain key information for analysis. When experts' judgments exhibit overconfidence, however, this can degrade not only decision quality but also the choice of method for aggregating expert opinions. This study defines a new continuous calibration measure based on the expected absolute deviation under the expert's subjective probability distribution and the absolute deviation observed at the realized value, and uses linear mixed models to analyze a database of more than 7,000 real expert probability interval estimates, thereby investigating the calibration of experts' probability judgments and its sources of variation while correcting the instability of the binary calibration measures used in previous research. With these more complete data, the study finds that under the binary calibration measure, previous research overestimated the variation attributable to the different sources (experts, questions, realizations); even after accounting for the random variation of realizations, however, the variation among experts remains significantly larger than the variation among questions. The analysis under the continuous calibration measure overturns this conclusion: after correcting for the random variation of realizations, the variation among experts is clearly smaller than the variation among questions, and a rigorous sensitivity analysis confirms that this conclusion is robust. Because the differences in calibration among experts are clearly smaller than the variation among questions, using seed questions to screen for more knowledgeable or better-calibrated experts is likely to yield limited benefits when aggregating expert probability judgments in practice. Moreover, because the random variation of questions and realizations is large, a great number of seed questions would be needed before expert screening becomes effective; simple linear opinion pooling therefore remains important.
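The continuous calibration measure described above, the ratio of the realization's observed absolute deviation to the expected absolute deviation under the expert's subjective distribution, can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the assumption that the expert's 90% interval comes from a normal subjective distribution, and the function name `calibration_ratio`, are ours.

```python
import math

# Two-sided 90% interval half-width in standard-deviation units (z_{0.95}).
Z90 = 1.6448536269514722

def calibration_ratio(lo: float, hi: float, realization: float) -> float:
    """Continuous calibration score: the observed absolute deviation of the
    realization divided by the expected absolute deviation under the expert's
    subjective distribution.

    Illustrative assumption: the expert's 90% interval [lo, hi] is the central
    interval of a normal distribution, so mean = midpoint and
    sd = width / (2 * Z90).
    """
    mean = (lo + hi) / 2.0
    sd = (hi - lo) / (2.0 * Z90)
    # For a normal distribution, E|X - mean| = sd * sqrt(2 / pi).
    expected_abs_dev = sd * math.sqrt(2.0 / math.pi)
    return abs(realization - mean) / expected_abs_dev
```

A ratio near 1 on average indicates good calibration; ratios systematically above 1 suggest overconfidence (intervals that are too narrow relative to how far realizations actually fall from the stated center).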

Parallel Abstract


Overconfidence, as revealed in experts' probability judgments, can seriously degrade decision quality. This study investigates and partitions the sources of variation in the calibration scores of experts' probability interval estimates. To assess the extent of overconfidence, the researchers analyzed a large data set of real expert opinion containing more than 7,000 distribution estimates. To lessen the instability caused by the binary calibration measures used in previous research, they developed a new continuous calibration measure based on the ratio of the observed absolute deviation of the realization to the expected absolute deviation under the expert's subjective probability distribution, and they analyzed the data with linear mixed models. Under the binary calibration measure, the estimated random effects were significantly smaller than those reported in previous studies, but even after the realization effect was taken into account, the question effect remained considerably smaller than the expert effect. Under the new continuous calibration measure, however, the variance among experts was much smaller than the random variance among questions or realizations, reversing the conclusion drawn from the binary measure; a thorough sensitivity analysis showed that this finding is robust. For the practical use of expert judgment, the relatively small expert random effect bodes ill for differential weighting schemes: the benefit of adopting seed questions to select more knowledgeable or better-calibrated experts may be limited, and the number of seed variables would likewise need to be large to yield a reliable weighting scheme.
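The variance-partitioning idea in the abstract, comparing the expert, question, and residual components of calibration scores, can be sketched with a toy simulation. This is an assumption-laden illustration only: the simulated effect sizes, the balanced design, and the method-of-moments estimator below are ours; the paper presumably fit linear mixed models to unbalanced real data.

```python
import random
import statistics

def simulate_scores(n_experts, n_questions, sd_expert, sd_question, sd_noise, seed=0):
    """Simulate calibration scores s[i][j] = 1 + expert_i + question_j + noise,
    a balanced crossed design with independent normal random effects."""
    rng = random.Random(seed)
    expert_eff = [rng.gauss(0, sd_expert) for _ in range(n_experts)]
    question_eff = [rng.gauss(0, sd_question) for _ in range(n_questions)]
    return [[1.0 + expert_eff[i] + question_eff[j] + rng.gauss(0, sd_noise)
             for j in range(n_questions)] for i in range(n_experts)]

def variance_components(scores):
    """Method-of-moments variance components for a balanced two-way layout
    (one observation per expert-question cell; interaction is absorbed
    into the residual)."""
    I, J = len(scores), len(scores[0])
    grand = statistics.mean(v for row in scores for v in row)
    row_means = [statistics.mean(row) for row in scores]
    col_means = [statistics.mean(scores[i][j] for i in range(I)) for j in range(J)]
    msa = J * sum((m - grand) ** 2 for m in row_means) / (I - 1)   # expert MS
    msb = I * sum((m - grand) ** 2 for m in col_means) / (J - 1)   # question MS
    mse = sum((scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(I) for j in range(J)) / ((I - 1) * (J - 1))
    return {"expert": max((msa - mse) / J, 0.0),
            "question": max((msb - mse) / I, 0.0),
            "residual": mse}
```

When the question component is simulated to dominate the expert component, as the paper's continuous-measure result suggests, the estimator recovers that ordering, which is the situation in which screening experts via seed questions offers little leverage.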

