Use Image Preprocessing to Improve the Accuracy of Convolutional Neural Network in Human Action Recognition

Advisor: 丁肇隆

Abstract


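In recent years, driven by advances in artificial intelligence and the Internet of Things, human action recognition has become a major research topic. Its goal is to let machines clearly understand the purpose and intent of human actions, with applications in medicine, education, entertainment, visual surveillance, video indexing, and more. Convolutional neural networks (CNNs) have recently achieved good results in image classification, but action recognition is more challenging than image classification, because raw video files are much larger than images and repeated frames in a video introduce redundancy. Many current CNN-based action recognition methods have high computational cost. For this reason, [1] proposed an effective method that trains deep neural networks directly on compressed video containing motion information, but that method still leaves room for improvement. This study replaces the RGB image sequence with a grayscale image sequence and, in the preprocessing stage, applies a wavelet transform to the data, removing noise while compressing the image size to 25% of the original. This approach lowers the computational cost and improves recognition accuracy by about 6.3%.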

Parallel Abstract


In recent years, with the development of artificial intelligence and the Internet of Things, human action recognition has become a major research topic. The goal is to allow machines to clearly understand the purpose and intention of human actions. Applications include medical treatment, education, entertainment, visual surveillance, and video indexing. Convolutional neural networks (CNNs) have recently achieved good results in image classification, but action recognition in video is more challenging than image classification, because video files are much larger than images and repeated frames in a video introduce redundancy. Many current action recognition methods based on convolutional neural networks have high computational cost. Therefore, [1] proposed an effective method that trains deep neural networks directly on compressed videos containing motion information, but the method can still be improved. We use grayscale image sequences instead of RGB image sequences, because we believe that grayscale image sequences carry enough information to recognize human actions. We also apply a wavelet transform to the data during preprocessing to remove noise and compress the image size to 25% of the original. The proposed method reduces the computational cost and improves recognition accuracy by about 6.3%.
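
The preprocessing described in the abstract (grayscale conversion followed by a wavelet transform that shrinks each frame to 25% of its original size) can be illustrated with a minimal sketch. The abstract does not state which wavelet, decomposition level, or grayscale formula was used, so this sketch assumes a one-level 2D Haar transform that keeps only the approximation (LL) subband, and assumes BT.601 luma weights for the grayscale step. The function names `to_gray`, `haar_ll`, and `preprocess` are hypothetical and are not taken from the thesis.

```python
import numpy as np

# Assumed grayscale weights (ITU-R BT.601 luma); the abstract does not specify them.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def to_gray(frame_rgb):
    """Convert an (H, W, 3) RGB frame to an (H, W) float grayscale frame."""
    return frame_rgb.astype(np.float64) @ LUMA_WEIGHTS


def haar_ll(gray):
    """Return the approximation (LL) subband of a one-level 2D Haar DWT.

    A trailing odd row or column is cropped so the frame splits into 2x2 blocks.
    The three detail subbands are discarded, which drops the finest-scale
    high-frequency content and halves both the height and the width.
    """
    h, w = gray.shape
    g = gray[: h - h % 2, : w - w % 2]
    # Orthonormal Haar: LL is the sum of each 2x2 block divided by 2.
    return (g[0::2, 0::2] + g[0::2, 1::2] + g[1::2, 0::2] + g[1::2, 1::2]) / 2.0


def preprocess(frames_rgb):
    """Map a (T, H, W, 3) RGB clip to a (T, H//2, W//2) grayscale LL sequence."""
    return np.stack([haar_ll(to_gray(frame)) for frame in frames_rgb])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical clip: 4 frames of 224x224 RGB noise, for shape checking only.
    clip = rng.integers(0, 256, size=(4, 224, 224, 3), dtype=np.uint8)
    out = preprocess(clip)
    print(out.shape)  # (4, 112, 112)
```

Under these assumptions each output frame has one quarter of the pixels of the input frame, which matches the 25% figure in the abstract. The single-channel grayscale input further reduces the data per frame relative to three-channel RGB.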

References


1. Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, and Philipp Krähenbühl, “Compressed video action recognition,” in CVPR, 2018.
2. Cisco, “Cisco Annual Internet Report (2018–2023),” white paper, 2020.
3. https://www.geminiopencloud.com/zh-tw/solutions/vsaas/
4. https://3smarket-info.blogspot.com/2019/11/2019-2027-vsaas.html
