Although they are machines, many artificial agents, like humans, make biased decisions. The present article discusses when a machine learning system learns to make biased decisions and how to understand its potentially biased decision-making processes using methods developed or inspired by cognitive psychology and cognitive neuroscience. Specifically, we explain how the inductive nature of supervised machine learning leads to nontransparent decision biases, such as the relative neglect of minority groups. By treating an artificial agent like a human research participant, we then review how neural and behavioral methods from the cognitive sciences, such as brain ablation and image occlusion, can be applied to reveal the decision criteria and tendencies of an artificial agent. Finally, we discuss the social implications of biased artificial agents and encourage cognitive scientists to join the movement to uncover and correct machine biases.
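To make the first point concrete, the following is a minimal sketch, not drawn from the article itself, of how a standard supervised classifier trained on group-imbalanced data can come to serve an under-represented group poorly. The synthetic data, group sizes, and the use of scikit-learn's LogisticRegression are all illustrative assumptions.

```python
# Illustrative sketch (not from the article): a supervised classifier fit on
# group-imbalanced data, with accuracy reported separately per group.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Synthetic two-class data for one (hypothetical) demographic group."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) > 2 * shift).astype(int)
    return X, y

# Majority group: 5000 examples; minority group: 100 examples drawn from a
# slightly different distribution (both choices are illustrative).
X_maj, y_maj = make_group(5000, shift=0.0)
X_min, y_min = make_group(100, shift=1.5)

X = np.vstack([X_maj, X_min])
y = np.concatenate([y_maj, y_min])
group = np.array([0] * len(y_maj) + [1] * len(y_min))  # 0 = majority, 1 = minority

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0, stratify=group)

clf = LogisticRegression().fit(X_tr, y_tr)

# The induced decision rule is dominated by the majority group's statistics,
# so accuracy on the minority group is typically much lower.
for g, name in [(0, "majority"), (1, "minority")]:
    mask = g_te == g
    print(f"{name} accuracy: {clf.score(X_te[mask], y_te[mask]):.2f}")
```

Because the learner minimizes average error over whatever data it is given, nothing in the training objective itself flags that one group's decisions are systematically worse, which is part of what makes such biases nontransparent.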
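The image-occlusion method mentioned above can likewise be sketched in a few lines, by analogy with covering parts of a stimulus shown to a human participant. The sketch below assumes only a generic `model` callable that returns class probabilities; the function name `occlusion_map` and its parameters are hypothetical. An ablation analogue would instead zero out hidden units or channels inside the network and compare the resulting change in behavior.

```python
# Illustrative sketch (not from the article) of occlusion-based probing:
# slide a neutral patch over an image and record how much the model's
# confidence in its original decision drops at each location.
import numpy as np

def occlusion_map(model, image, target_class, patch=8, stride=8, fill=0.5):
    """Heatmap of how much occluding each region lowers the probability
    the model assigns to `target_class`."""
    H, W = image.shape[:2]
    baseline = model(image[None])[0, target_class]
    heat = np.zeros(((H - patch) // stride + 1, (W - patch) // stride + 1))
    for i, r in enumerate(range(0, H - patch + 1, stride)):
        for j, c in enumerate(range(0, W - patch + 1, stride)):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = fill  # cover this region
            heat[i, j] = baseline - model(occluded[None])[0, target_class]
    return heat  # large values mark regions the decision relies on
```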