Deep Neural Networks for Visual Image Reconstruction from Functional Magnetic Resonance Imaging

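Functional magnetic resonance imaging (fMRI) is a noninvasive tool for imaging brain function; it uses magnetic resonance imaging to measure the hemodynamic changes induced by neuronal activity. When a person receives different stimuli, different brain regions show different activation responses, and we expect that analyzing brain data recorded at different times will allow us to infer the stimulus the subject was receiving at that time. In this study, we focus on image reconstruction from visual stimuli and their corresponding brain responses. The thesis has two main parts. The first is the collection of experimental data: subjects viewed many different images while their brain responses were recorded. Because the experiment mainly involves visual stimulation and the stimulus set includes human faces, we extract the visual cortex and the fusiform face area for further analysis. The second part is model training. To reconstruct the visual stimulus corresponding to a brain response, we drew on generative adversarial networks and trained a three-stage model. The first stage uses only the stimulus images to train a generator and an encoder, which encode images into a useful numerical (latent) space and reconstruct the corresponding images from that space; the second stage uses the trained encoder to encode each image into a set of feature values in the feature space while mapping the subject's brain data onto the same feature space, with the aim of making the two as close as possible; the third stage feeds the brain-response features extracted in the previous stage, together with the corresponding stimuli, into a generative adversarial network for training, with the aim of generating from the brain features the stimulus image the subject was viewing.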

Keywords

Functional magnetic resonance imaging; visual image reconstruction; deep neural networks

Parallel Abstract

Functional magnetic resonance imaging (fMRI) is a noninvasive tool for imaging brain function. When a brain region is activated, blood flow to that region increases. Because cerebral blood flow and neuronal activation are coupled, fMRI can be used to measure which regions are active. When people engage in different types of activities, neuronal activity fluctuates constantly in different parts of the brain. We expect to infer the stimuli that a subject receives by analyzing the corresponding brain responses. In this work, we aim to reconstruct the images that the subject viewed during the experiment from the corresponding brain responses. The work has two major parts. The first is data collection in a visual experiment: we asked the subject to view different images and recorded the brain responses at the same time. Because the experiment concerns visual stimulation and the reconstruction of visual stimuli, we specify the visual cortex as a region of interest (ROI). Because the visual stimuli contain human faces, the fusiform face area (FFA) is also included in the ROI. The second part is training a model for reconstruction. To reconstruct the stimuli, we modified Adversarial Generator-Encoder Networks to build a three-stage architecture. The first stage trains an encoder-generator pair on the original visual stimuli: the encoder encodes images into the latent space efficiently, and the generator generates images from the latent space that are similar to the ground-truth images. The second stage encodes the images into a set of encoded features while extracting features from the fMRI data, which are forced to be close to the encoded features. In the last stage, a new encoder-generator pair is trained using the features extracted from the fMRI data and their corresponding images. We expect the final three-stage network to generate the images viewed by the subject during the fMRI experiment.
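
The idea behind the second stage, pulling fMRI-derived features toward the latent codes of the viewed images, can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes a plain linear map in place of the neural feature extractor, a mean squared distance as the closeness criterion, and synthetic data in place of real ROI voxel responses and encoder outputs. All names and dimensions (`fit_linear_map`, `latent_mse`, `n_voxels`, `n_latent`) are hypothetical.

```python
import random

def apply_map(W, x):
    """Project one fMRI feature vector x into the latent space (W @ x)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def latent_mse(W, X, Z):
    """Mean squared distance between mapped fMRI vectors and image codes."""
    total = 0.0
    for x, z in zip(X, Z):
        total += sum((p - t) ** 2 for p, t in zip(apply_map(W, x), z))
    return total / len(X)

def fit_linear_map(X, Z, lr=0.05, epochs=500):
    """Fit W by full-batch gradient descent on latent_mse (assumed loss)."""
    n, d_x, d_z = len(X), len(X[0]), len(Z[0])
    W = [[0.0] * d_x for _ in range(d_z)]
    for _ in range(epochs):
        grad = [[0.0] * d_x for _ in range(d_z)]
        for x, z in zip(X, Z):
            pred = apply_map(W, x)
            for k in range(d_z):
                err = pred[k] - z[k]
                for j in range(d_x):
                    grad[k][j] += 2.0 * err * x[j] / n
        for k in range(d_z):
            for j in range(d_x):
                W[k][j] -= lr * grad[k][j]
    return W

# Synthetic stand-ins (assumptions): X plays the role of ROI voxel responses,
# Z the role of latent codes produced by the frozen stage-one encoder.
random.seed(0)
n_trials, n_voxels, n_latent = 40, 6, 2
W_true = [[random.uniform(-1, 1) for _ in range(n_voxels)]
          for _ in range(n_latent)]
X = [[random.gauss(0, 1) for _ in range(n_voxels)] for _ in range(n_trials)]
Z = [[v + random.gauss(0, 0.01) for v in apply_map(W_true, x)] for x in X]

W_init = [[0.0] * n_voxels for _ in range(n_latent)]
initial_loss = latent_mse(W_init, X, Z)
W = fit_linear_map(X, Z)
final_loss = latent_mse(W, X, Z)
print("latent MSE before:", initial_loss, "after:", final_loss)
```

In this sketch the image codes `Z` stay fixed, mirroring the description that the encoder has already been trained in the first stage; only the map from brain data to the latent space is updated.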