
Detailed Record

Author (Chinese): 李宜臻
Author (English): Lee, Yi-Jen
Title (Chinese): 具長短期記憶之序列特徵選取方法
Title (English): Feature Selection with Long Short-Term Memory from Sequential Data
Advisor (Chinese): 蕭舜文
Advisor (English): Hsiao, Shun-Wen
Committee members (Chinese): 陳孟彰, 黃意婷
Committee members (English): Chen, Meng-Chang; Huang, Yi-Ting
Degree: Master's
University: National Chengchi University (國立政治大學)
Department: Management Information Systems (資訊管理學系)
Year of publication: 2019
Academic year of graduation: 107
Language: English
Number of pages: 45
Keywords (Chinese): 遞歸神經網路, 特徵萃取, 序列型資料, 長短期記憶神經網路
Keywords (English): Recursive Neural Network, Feature extraction, Sequential data, Long Short-Term Memory neural network
DOI URL: http://doi.org/10.6814/NCCU202000305
Statistics:
  • Recommendations: 0
  • Views: 111
  • Downloads: 21
  • Favorites: 0
Abstract (Chinese):
Sequential data, composed of an ordered series of individual objects, is widely used in our daily life, for example in text, video, speech signals, and website usage logs. Analyzing such data typically requires considerable manual effort and time. In recent years, neural networks have performed well on classification and a variety of natural language processing tasks. Although these techniques are mature, it is hard to understand what information they use to reach their goals, and a simple classifier alone cannot satisfy the need to know what that information is. We therefore propose a neural-network-based feature filter for analyzing sequential data, which filters useful and human-readable information out of the raw data for subsequent classification.
In this thesis we design a neural network framework, filteRNN, which contains a filter structure. Through this filter we can extract valuable, human-readable features for subsequent classification. We use malware data and review text to demonstrate filtering from raw data, and the filtered data are fed into a classifier. The model is able to filter out half of the raw data. Because both the filter and the classifier are important in this framework, we also examine the framework's effectiveness by trying different filters and classifiers, and we compare our framework against an attention model. Experimental results show that we can extract the features shared within each class across different sequential datasets for further study.
Abstract (English):
Sequential data, which consists of an ordered list of single objects, appears in a wide range of applications in our daily life, such as texts, videos, speech signals, and web usage logs. In general, the analysis of such data requires a lot of human work and time. In recent years, neural networks (NNs) have achieved state-of-the-art performance in classification and a variety of NLP tasks. Although such techniques are well-developed, it is difficult for us to understand what information they use to reach their goals. Such needs may not be satisfied by a simple classifier; hence we propose an NN-based characteristics filter for analyzing sequential data, in order to filter useful and human-readable information from the raw data for a further classifier.
In this paper, we design an NN framework (filteRNN) which embeds a filter structure that can filter valuable, human-readable features for later classification. We use datasets of malware samples and reviews to demonstrate the capability of filtering the raw data, and the filtered data are fed into a classifier for classification. The models are able to filter out half of the raw data. Since filters and classifiers play key roles in this task, we implement different filters and classifiers to examine the effectiveness of the framework. In addition, an attention model is used for comparison with our framework. Experimental results indicate that we can extract the commonly shared characteristics of categories in different sequential datasets for further study.
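The abstract describes a two-stage pipeline: a filter that keeps a human-readable subset of the raw sequence, followed by a classifier over what survives. The sketch below illustrates that filter-then-classify idea with hypothetical, hand-picked weights and dimensions; it is not the thesis's actual filteRNN architecture (which is built on recurrent networks), just a minimal per-timestep sigmoid gate that drops timesteps, followed by a toy mean-pool softmax classifier.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def filter_sequence(seq, W, b, threshold=0.5):
    """Score each timestep with a gate and keep only the timesteps
    whose score exceeds the threshold (the 'filter' stage)."""
    scores = sigmoid(seq @ W + b)           # one score per timestep
    mask = scores.squeeze(-1) > threshold   # boolean keep/drop decision
    return seq[mask], mask

def classify(filtered_seq, V, c):
    """Toy classifier stage: mean-pool the surviving timesteps,
    then apply a linear layer and a softmax over two classes."""
    pooled = filtered_seq.mean(axis=0)
    logits = pooled @ V + c
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Hypothetical data: 4 timesteps, 2 features each.
seq = np.array([[ 2.0,  1.0],
                [-3.0,  0.5],
                [ 1.5, -0.5],
                [-2.0, -1.0]])
W = np.array([[1.0], [0.0]])   # the gate looks only at the first feature
b = np.zeros(1)
V = np.array([[1.0, -1.0],
              [0.0,  1.0]])
c = np.zeros(2)

kept, mask = filter_sequence(seq, W, b)
probs = classify(kept, V, c)
# With these weights, the gate keeps timesteps 0 and 2 (positive first
# feature) and drops timesteps 1 and 3.
```

In the thesis's setting the gate and classifier weights are learned jointly rather than hand-picked, and the kept subset is what makes the filtered features human-readable.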
Table of Contents:
1 Introduction 6
2 Related Work 8
2.1 Sequential Pattern Based Classification 8
2.2 NN-approach Sequential Data Analysis 8
2.3 Attention Mechanism 9
3 Framework Design 11
3.1 filteRNN 11
3.2 Filter 11
3.3 Classifier 12
3.4 Convolutional Neural Network 13
3.5 Recurrent Neural Network 14
4 Evaluation 17
4.1 Dataset 17
4.2 Preprocessing 20
4.3 Test Hyperparameters 23
4.4 Model Complexity 26
4.5 Loss Function Design 28
4.6 FilteRNN with Malwares 29
4.7 Generalization - IMDB Reviews 35
5 Conclusion 39
Reference 42