
Graduate student: 陳則光
Chen, Ze-Guang
Thesis title: Variational autoencoder based disentangle model design and application: scRNA-seq clustering and cell perturbation prediction
Advisor: 葉家宏
Yeh, Chia-Hung
康立威
Kang, Li-Wei
Oral defense committee: 葉家宏
Yeh, Chia-Hung
張傳育
Zhuan, Chuan-Yu
陳俊良
Chen, Jun-Liang
林俊秀
Lin, Jyun-Siou
Oral defense date: 2022/04/11
Degree: Master
Department: Department of Electrical Engineering
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: English
Number of pages: 26
Research method: Experimental design
DOI URL: http://doi.org/10.6345/NTNU202201313
Thesis type: Academic thesis
  • Deep models are now used in many fields, in both industry and scientific research. Although they can automatically generate many features that fit the results we require, these features are difficult for humans to interpret. When a model produces a result, it is often hard to understand how it arrived at that result, and therefore hard to verify that the result is reasonable. The goal of this research is to design a variational autoencoder (VAE) based model that generates more interpretable features. First, we propose a method for estimating the information correlation between the features generated by the model, and use this estimate to adjust the hyperparameters during training so that the model generates disentangled features that are informationally independent of each other. We show that using these disentangled features effectively improves the clustering accuracy of single-cell RNA sequencing (scRNA-seq). This thesis also proposes a model that predicts the state of cells after perturbation by unraveling the perturbation-invariant information. Experiments show that this not only improves prediction accuracy but also provides a basis for the prediction, and can to some extent predict the state of cells before perturbation.
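    The idea of measuring how much information the latent features share can be illustrated with total correlation, the quantity that is zero exactly when the features are mutually independent. The following is a minimal sketch, assuming the latent samples are approximately jointly Gaussian, in which case total correlation reduces to a closed form on the sample covariance. It is not the estimator used in the thesis; the function name `gaussian_total_correlation` and the toy data are hypothetical and serve only as illustration.

```python
import numpy as np


def gaussian_total_correlation(z):
    """Total correlation (in nats) of samples z with shape (n_samples, n_dims).

    Assumes z is jointly Gaussian, so that
    TC = 0.5 * (sum_j log var(z_j) - log det cov(z)).
    """
    z = np.asarray(z, dtype=float)
    cov = np.atleast_2d(np.cov(z, rowvar=False))
    # Sum of log marginal variances (entropy of the factorized distribution).
    marginal_log_var = np.sum(np.log(np.diag(cov)))
    # Log determinant of the joint covariance (entropy of the joint distribution).
    sign, joint_log_det = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return 0.5 * (marginal_log_var - joint_log_det)


# Toy data: independent latents versus linearly mixed (entangled) latents.
rng = np.random.default_rng(0)
independent = rng.normal(size=(5000, 3))
mixing = np.array([[1.0, 0.8, 0.0],
                   [0.0, 1.0, 0.8],
                   [0.0, 0.0, 1.0]])
entangled = independent @ mixing

tc_independent = gaussian_total_correlation(independent)
tc_entangled = gaussian_total_correlation(entangled)
print("TC of independent latents:", tc_independent)
print("TC of entangled latents:  ", tc_entangled)
```

    A score close to zero suggests the features are close to independent under the Gaussian assumption, while a larger score indicates shared information. A measurement of this kind could in principle drive a hyperparameter schedule during training, although how the thesis does so is not stated in this record.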

    Chapter 1 Introduction
      1.1 Motivation & Overview
      1.2 Research Purposes & Results Overview
        1.2.1 Unsupervised Disentanglement Representation Learning
        1.2.2 Cell Perturbation Prediction
    Chapter 2 Background & Literature review
      2.1 Chapter Introduction
      2.2 Variational Autoencoder (VAE)
      2.3 Disentanglement
      2.4 Single Cell RNA Sequence (scRNA-seq)
    Chapter 3 Adaptive Variational Autoencoder
      3.1 Chapter Introduction
      3.2 VAE Loss Function Decomposition
      3.3 Independence measurement
        3.3.1 The compromise of total independence
      3.4 scRNA-seq Clustering using disentangled variables
      3.5 Datasets
      3.6 Experiment results of disentangling
      3.7 Comparison of scRNA-seq clustering
    Chapter 4 Cell perturbation prediction by unraveling the perturbation invariant information
      4.1 Chapter Introduction
      4.2 Assumption
      4.3 Overview of INVAE
      4.4 The loss functions
        4.4.1 The beta-TCVAE loss
        4.4.2 The ANOVA information navigation loss
        4.4.3 The total loss function
      4.5 The training process
      4.6 Datasets
        4.6.1 The Haber dataset
        4.6.2 The Kang dataset
        4.6.3 The LPS dataset
      4.7 Prediction results
      4.8 Interpretability
      4.9 The previous condition state prediction
    Chapter 5 Conclusions
    Reference

