Regularization Convergence and Feature Selection

SelectNet: Feature selection based on regularization loss

Advisor: 歐陽彥正

Abstract


Interpretability is an important part of machine learning, especially now that ever more powerful deep models are applied to all kinds of problems and are criticized for their black-box decisions. Feature selection is one way to understand data: by reducing the dimensionality of the input space, we can better grasp the characteristics of the data. We propose a simple network layer, SelectNet, which applies a regularization loss on the feature space to force the model to use fewer features during end-to-end training. We apply SelectNet to two synthetic datasets to verify its feature-selection ability, and to two real-world problems to demonstrate the benefit of finding the key features and to further validate its practical effectiveness. Our model improves interpretability without harming accuracy. Because the method avoids noise from unnecessary features, the model also becomes more robust. SelectNet can take any advanced network architecture as its downstream model, not just fully connected layers. We apply it to MNIST with CNN layers; compared with the baseline, it still achieves the same performance while also revealing which pixels are unneeded.

Keywords

Feature selection, deep learning, regularization

English Abstract


Interpretability is an increasingly significant issue in machine learning, especially in deep learning: recent progress with powerful deep models has raised the need to interpret their black-box decisions. Feature selection is one way to help people understand a difficult problem by explaining the dataset. We propose a simple network layer, SelectNet, which uses a regularization loss on the feature space to force the model to use fewer features during end-to-end training. We apply SelectNet to two synthesized datasets to examine its feature-selection ability, and to two real-world problems to show the benefit of finding the key features. Our model shows which features are actually in use, without harming accuracy. Since the method avoids noise from unnecessary features, the model becomes more robust. SelectNet can take any modern network architecture, not just a fully connected network, as its downstream model. We apply it to MNIST with CNN layers, and it still achieves the same performance as the benchmark, while also showing which pixels are unnecessary.
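
The abstracts describe the mechanism only at a high level: a selection layer placed in front of an arbitrary downstream network and trained end-to-end, with a sparsity-inducing regularization loss on the feature space added to the task loss. The PyTorch sketch below illustrates that idea under stated assumptions; the element-wise gates, the L1 penalty, and names such as l1_coeff and regularization_loss are illustrative choices, not necessarily the thesis's exact formulation.

```python
import torch
import torch.nn as nn

class SelectNet(nn.Module):
    """Sketch of a feature-selection layer (assumed formulation).

    Each input feature x_i is scaled by a learnable gate w_i, and an
    L1 penalty on the gates is added to the task loss, so training
    drives most gates toward zero and keeps only the key features.
    """

    def __init__(self, n_features: int, downstream: nn.Module, l1_coeff: float = 1e-3):
        super().__init__()
        self.gates = nn.Parameter(torch.ones(n_features))  # one gate per input feature
        self.downstream = downstream  # any architecture: MLP, CNN, ...
        self.l1_coeff = l1_coeff      # strength of the sparsity penalty

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Element-wise gating of the input features before the downstream model.
        return self.downstream(x * self.gates)

    def regularization_loss(self) -> torch.Tensor:
        # Sparsity-inducing loss on the feature space.
        return self.l1_coeff * self.gates.abs().sum()

# Example training step: the gate penalty is simply added to the task loss.
downstream = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
model = SelectNet(n_features=20, downstream=downstream)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
loss = criterion(model(x), y) + model.regularization_loss()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After training, the magnitudes of the learned gates indicate which features the model actually uses; gates near zero mark features (or, in the MNIST experiment, pixels) that can be discarded.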
