Using Image Preprocessing to Improve the Accuracy of Convolutional Neural Networks for Human Action Recognition

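In recent years, with the development of artificial intelligence and the Internet of Things, human action recognition has become a major research focus. Its goal is to enable machines to clearly understand the purpose and intent of human actions, with applications in medical care, education, entertainment, visual surveillance, video indexing, and more. Convolutional neural networks (CNNs) have recently achieved good results in image classification, but action recognition is more challenging than image classification, because raw video files are much larger than images and repeated frames within a video introduce redundancy. Many current CNN-based action recognition methods have high computational cost. For this reason, [1] proposed an effective method that trains deep neural networks directly on compressed video containing motion information, but that method still leaves room for improvement. This study replaces RGB image sequences with grayscale image sequences and, in preprocessing, applies a wavelet transform to the data, removing noise while compressing the image size to 25% of the original. This reduces the computational cost and improves recognition accuracy by about 6.3%.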

Keywords

image processing; convolutional neural network; human action recognition; video classification; image compression

Parallel Abstract

In recent years, with the development of artificial intelligence and the Internet of Things, human action recognition has become a major research topic. Its purpose is to allow machines to clearly understand the purpose and intention of human actions. Applications include medical treatment, education, entertainment, visual surveillance, and video indexing. Convolutional neural networks (CNNs) have achieved good results in image classification, but action recognition in video is more challenging than image classification, because video files are much larger than single images and repeated frames in a video cause redundancy. Many current action recognition methods based on convolutional neural networks have high computational costs. Therefore, [1] proposed an effective method that trains deep neural networks directly on compressed videos containing motion information, but the method can still be improved. We use grayscale image sequences instead of RGB image sequences, because we believe that grayscale image sequences carry enough information to recognize human actions. In preprocessing, we apply a wavelet transform to the data to remove noise and compress the image size to 25% of the original. The proposed method reduces the computational cost and improves recognition accuracy by about 6.3%.
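The preprocessing described in the abstract (grayscale conversion followed by a wavelet transform that shrinks each frame to 25% of its size) can be illustrated with a minimal sketch. The abstract does not state which wavelet is used, how many decomposition levels are applied, or which coefficients are kept, so everything below is an assumption: a one-level 2D Haar transform in which only the low-frequency (LL) subband is retained, which halves both height and width. The BT.601 luma weights and the function names `to_grayscale`, `haar_approximation`, and `preprocess` are likewise hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def to_grayscale(frame_rgb):
    """Convert an H x W x 3 RGB frame to a single-channel float image.

    Assumption: ITU-R BT.601 luma weights; the abstract does not say
    how the grayscale sequence is derived.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return frame_rgb.astype(np.float64) @ weights


def haar_approximation(gray):
    """Return the LL (approximation) subband of a one-level 2D Haar transform.

    Assumption: Haar wavelet, one level, detail subbands discarded.
    The result has half the height and half the width of the input,
    i.e. 25% of the original number of pixels.
    """
    h, w = gray.shape
    gray = gray[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a = gray[0::2, 0::2]  # top-left pixel of every 2 x 2 block
    b = gray[0::2, 1::2]  # top-right
    c = gray[1::2, 0::2]  # bottom-left
    d = gray[1::2, 1::2]  # bottom-right
    # Orthonormal Haar scaling: sum of each 2 x 2 block divided by 2.
    return (a + b + c + d) / 2.0


def preprocess(frame_rgb):
    """Grayscale conversion followed by Haar LL extraction for one frame."""
    return haar_approximation(to_grayscale(frame_rgb))


if __name__ == "__main__":
    frame = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)
    small = preprocess(frame)
    print(frame.shape[:2], "->", small.shape)
```

Dropping the three detail subbands removes the highest-frequency content of each frame, which is one simple way a wavelet step can suppress noise while reducing the input size; a thresholding scheme on the detail coefficients would be another option.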

Parallel Keywords

image processing; convolutional neural network; human action recognition; video classification; image compression

References

1. Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, and Philipp Krähenbühl, “Compressed video action recognition,” in CVPR, 2018.