
Investigating the Roles of Different Striatal Subregions in Reinforcement Learning and Reward Prediction Error in Mice

The Role of Striatal Subregions in Reinforcement Learning Process and Reward Prediction Error using Excitotoxic Lesion in Male Mice

Advisor: 賴文崧

Abstract


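The striatum is part of the basal ganglia and is the main structure through which the basal ganglia receive input; it also takes part in motor control and reward-related learning. Recent studies indicate that the striatum is involved in updating action values and the reward prediction error signal (the difference between the reward an individual expects and the reward it actually receives). The striatum can be further divided into three subregions, each associated with a different kind of learning process. The dorsomedial striatum mainly receives input from the association cortices and is related to goal-directed behavioral learning; the dorsolateral striatum mainly receives input from the sensorimotor cortices and is related to habit learning; the nucleus accumbens is regarded as an important region for representing the expectation of future reward, and this expectation can in turn influence reward-directed action selection. However, the roles of the striatal subregions in reinforcement learning and reward-related learning, and the underlying mechanisms, remain unsettled. The aim of this study was therefore to examine the roles of different striatal subregions in reinforcement learning and in updating the reward prediction error signal. An excitotoxic lesioning agent was injected into different striatal subregions, and a two-choice dynamic reward task was used to observe whether learning behavior changed in the lesioned mice. The two-choice dynamic reward task comprised two sets of reward-probability learning, and every choice made by each mouse was recorded. We analyzed the data with a reinforcement learning model, estimating the parameters related to reward prediction error by Bayesian estimation, and additionally used the matching law to analyze the choice tendencies of the mice. The results showed that, across the whole learning process, mice with dorsomedial striatum lesions needed more choices than control mice to reach the preset criterion and also accumulated more errors during learning. Mice with dorsolateral striatum lesions and mice with nucleus accumbens lesions showed no difference in overall learning behavior. The reinforcement learning model analysis found that mice with dorsomedial striatum lesions and mice with nucleus accumbens lesions both showed a slower update of the reward prediction error signal and a slight increase in choice consistency. The matching law analysis found no differences between any lesion group and its controls. Overall, this study confirmed that functional damage to the dorsomedial striatum affects performance in reward-related learning and behavioral decision making. It also confirmed the importance of the dorsomedial striatum and the nucleus accumbens for the two-choice dynamic reward task, and that both regions play important roles in the value-evaluation and action-selection components of decision making.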
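The matching law analysis mentioned above can be illustrated with a minimal sketch. The abstract does not say which form of the matching law was fitted, so this example assumes the commonly used generalized matching law, log(C1/C2) = s * log(R1/R2) + log(b), where the slope s is read as reward sensitivity and b as side bias. The function name `fit_generalized_matching` and the example ratios are hypothetical and are not taken from the thesis.

```python
import math

def fit_generalized_matching(choice_ratios, reward_ratios):
    """Least-squares fit of log(C1/C2) = s * log(R1/R2) + log(b).

    ASSUMPTION: generalized matching law; the thesis abstract does not
    specify the exact form used. Returns (s, log_b), where s is the
    sensitivity slope and log_b is the log of the bias term.
    """
    xs = [math.log(r) for r in reward_ratios]   # log reward ratios
    ys = [math.log(c) for c in choice_ratios]   # log choice ratios
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # sensitivity s
    intercept = mean_y - slope * mean_x  # log bias
    return slope, intercept

# Hypothetical per-block ratios for one animal (illustrative only).
reward_ratios = [0.25, 1.0, 4.0]
choice_ratios = [r ** 0.8 for r in reward_ratios]  # undermatching, no bias
sensitivity, log_bias = fit_generalized_matching(choice_ratios, reward_ratios)
```

A slope below 1 is conventionally described as undermatching, meaning choices track the reward ratio less sharply than strict matching would predict.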

Parallel Abstract


The striatum is the principal input structure of the basal ganglia and influences motor control and reward-based learning. Emerging studies indicate that it also contributes to updating action values and the reward prediction error (RPE), the discrepancy between predicted and actual rewards. Previous studies imply that three subregions of the striatum participate in different kinds of learning processes. The dorsomedial striatum (DMS, also known as the “associative striatum” in primates), which receives inputs from the association cortices, is implicated in goal-directed behavior in rodents. The dorsolateral striatum (DLS, a part of the sensorimotor striatum in primates) is related to habit learning in rodents. The nucleus accumbens (NA) is implicated in representing predicted future reward, and this representation can be used to guide action selection for reward. However, the precise role and mechanism of each subregion in reinforcement learning and reward-based decision making are still under debate. The aim of this study was to examine the roles of the striatal subregions (DMS, DLS, and NA) in the reinforcement learning process and in reward prediction error, using excitotoxic lesions and a 2-choice dynamic foraging task in male C57BL/6 mice. The 2-choice dynamic foraging task is a risky-choice task consisting of two kinds of reward-ratio learning. The behavioral performance of each of the three lesioned groups and their sham controls was recorded. Their trial-by-trial choice behavior was then fitted with a standard reinforcement learning model using a Bayesian estimation approach, and analyzed with the matching law, to estimate parameters for RPE and reward sensitivity. Overall behavioral results indicated that, compared with sham controls, DMS-lesioned mice required more trials to reach the preset criteria and made more cumulative errors during the learning process of this dynamic foraging task. In contrast to the DMS group, neither the NA-lesioned nor the DLS-lesioned group exhibited more accumulated trials or more cumulative errors. Reinforcement learning model analysis further revealed that both DMS-lesioned and NA-lesioned mice had a lower learning rate for updating the RPE signal and slightly higher perseveration than their sham controls. No significant difference in reward sensitivity was found among the 3 groups. Collectively, the current study confirmed the importance of the DMS and NA in the 2-choice dynamic foraging task and their roles in the value component and the choice component of decision making. Excitotoxic lesions of the DMS can significantly impair performance in probabilistic reward-based learning and decision making.
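To make the model terms in the abstract concrete, here is a minimal sketch of a delta-rule reinforcement learning model for a two-choice task. It is an illustration under stated assumptions, not the thesis's actual model: the abstract only says "a standard reinforcement learning model", so the softmax choice rule, the parameter names `alpha` (learning rate), `beta` (inverse temperature), and `kappa` (perseveration bonus), and both function names are hypothetical.

```python
import math

def rpe_update(q, reward, alpha):
    """Delta-rule value update for the chosen option.

    The reward prediction error (RPE) is the actual reward minus the
    predicted value q; alpha scales how strongly the RPE changes q.
    Returns (new_q, rpe).
    """
    rpe = reward - q
    return q + alpha * rpe, rpe

def choice_prob_left(q_left, q_right, beta, kappa=0.0, prev_choice=None):
    """Probability of choosing the left option under a softmax rule.

    ASSUMPTION: two-option softmax with an additive perseveration term.
    kappa > 0 biases the animal toward repeating prev_choice, which must
    be "left", "right", or None (no previous choice).
    """
    bonus = 0.0
    if prev_choice == "left":
        bonus = kappa
    elif prev_choice == "right":
        bonus = -kappa
    logit = beta * (q_left - q_right) + bonus
    return 1.0 / (1.0 + math.exp(-logit))

# One rewarded trial, starting from a value of 0 for the chosen option.
fast_q, rpe = rpe_update(0.0, 1.0, alpha=0.5)  # larger learning rate
slow_q, _ = rpe_update(0.0, 1.0, alpha=0.1)    # smaller learning rate
```

In this sketch a lower `alpha` means the same RPE moves the value estimate less per trial, which is one way to read the "lower learning rate" reported for the DMS and NA groups; a larger `kappa` raises the probability of repeating the previous choice regardless of value.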
