
Information Bottleneck in DNN Behavior Analysis and its Applications

Advisor: 吳家麟

Abstract


Deep neural networks (DNNs) can be regarded as the core technology behind the rise of modern artificial intelligence. Through learning, a DNN can transform a highly complex problem into a non-linear functional relation between inputs and outputs, and DNNs have achieved remarkable results in computer vision, natural language processing, image processing, and many other fields, with accuracy that sometimes even surpasses human ability. Yet we still know almost nothing about how a DNN model works internally. For example, we cannot design a metric that helps us construct the most suitable model architecture for a given task, we cannot explain how a DNN reaches decisions from the knowledge it has learned, and we cannot give stability guarantees against known adversarial attacks. DNNs are therefore criticized as "black boxes" whose decisions cannot be trusted with full confidence. In recent years, more and more researchers have tried to provide interpretability for the behavior of DNNs; one of the more theoretical directions is based on information theory, which has long been applied in digital communication and data compression. Researchers have attempted to establish connections between DNNs and information theory in order to analyze, and even further optimize, DNNs; among these efforts, the Information Bottleneck (IB) of Tishby et al. is the best known. This thesis introduces several connections and applications between information theory and DNNs, and focuses on IB-based analyses of how DNNs operate, together with the supporting and opposing views that have appeared in the literature. We conclude that the information dynamics observed under different settings do not necessarily reflect the amount of information a network has actually learned, which affects the effectiveness or feasibility of information-theoretic applications to DNNs.

Abstract (English)


Deep neural networks have become the technical core of modern artificial intelligence research in recent years. They are able to turn a complex problem into a non-linear relation between inputs and outputs through learning, and they have achieved practical success in many tasks, such as computer vision, natural language processing and image processing. Surprisingly, there are many scenarios in which DNNs even outperform humans. Despite their great success, very little is known about the inner organization or theoretical principles of DNNs. For instance, we cannot find a standard for constructing the most appropriate model architecture for a specific task, nor can we explain how DNNs learn from structured knowledge (training data); we also cannot guarantee the robustness of DNNs against adversarial attacks. These issues have led DNNs to be called "black boxes", which makes it difficult for people to have confidence in the decisions they make. In recent years, however, more and more research has attempted to offer reasonable explanations of learning with DNNs; one of these directions is based on information theory, which has been applied in digital communication and data compression for many years. Researchers have attempted to connect DNNs with information theory in order to analyze, or further optimize, DNN performance. Among these related points of view, the "information bottleneck" proposed by Tishby et al. is widely used in different applications. The aim of this thesis is to investigate the different perspectives on the information bottleneck and to introduce applications and analytical methods for DNN behavior. In conclusion, we discuss whether the information variation observed while training DNNs carries genuine physical meaning.
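The information bottleneck idea summarized above can be made concrete with a small sketch. The IB method of Tishby et al. minimizes the Lagrangian L = I(X;T) − β·I(T;Y), trading compression of the input X by a representation T against preservation of information about the label Y. A minimal illustration, assuming discrete variables and plug-in histogram estimates of mutual information (the function names here are our own, not from the thesis):

```python
import numpy as np

def mutual_information(x, y):
    """Estimate I(X;Y) in bits for two discrete 1-D sequences
    using the plug-in estimator on their joint histogram."""
    x_vals, x_idx = np.unique(np.asarray(x), return_inverse=True)
    y_vals, y_idx = np.unique(np.asarray(y), return_inverse=True)
    # Build the joint count table, then normalize to a joint distribution.
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X (column vector)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y (row vector)
    nz = p_xy > 0                            # skip zero-probability cells
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))

def ib_objective(x, t, y, beta):
    """Information Bottleneck Lagrangian: I(X;T) - beta * I(T;Y).
    Smaller values mean T compresses X more while keeping info about Y."""
    return mutual_information(x, t) - beta * mutual_information(t, y)

# A deterministic copy carries 1 bit; an independent pairing carries 0 bits.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # → 1.0
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # → 0.0
```

In the DNN analyses discussed in this thesis, T is typically a (binned) layer activation, and curves of I(X;T) versus I(T;Y) over training epochs form the "information plane"; the binning step itself is one source of the disputed observations mentioned in the conclusion.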

References


1. Cover, T. M.; Thomas, J. A., Elements of Information Theory. John Wiley & Sons: 2012.
2. Elad, A.; Haviv, D.; Blau, Y.; Michaeli, T., The effectiveness of layer-by-layer training using the information bottleneck principle. 2018.
3. Wu, T.; Fischer, I.; Chuang, I.; Tegmark, M., Learnability for the Information Bottleneck. 2019.
4. Tishby, N.; Pereira, F. C.; Bialek, W., The information bottleneck method. arXiv preprint physics/0004057, 2000.
5. Eykholt, K.; Evtimov, I.; Fernandes, E.; Li, B.; Rahmati, A.; Xiao, C.; Prakash, A.; Kohno, T.; Song, D., Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018; pp 1625-1634.
