運用骨架提取於動作辨識之研究

The Study on Skeleton Extraction of Human Behavior Recognizing

Advisor: 李維平
The full text will be available for download on 2026/10/14.

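Abstract


Human action recognition is a popular and challenging goal in deep learning. As hardware keeps improving, applications of action recognition have multiplied. In healthcare, video is used to detect whether elderly people have fallen and whether a workout movement is likely to cause injury; in sports, recognizing the body and the point of contact with the ball helps players refine their technique; in retail, recognizing how customers handle products provides a reference for purchase intent, and recognizing unsafe behavior on escalators helps prevent accidents; in vehicles, recognizing whether a truck driver is using a mobile phone improves the safety of other road users. Skeleton extraction has clearly improved accuracy in action recognition, but most work remains confined to images, or uses the extracted keypoint coordinates only to render images; the skeleton coordinates themselves are rarely used directly as training data. This study combines the two-dimensional coordinates with the time dimension, packs them into a three-dimensional "picture" format, and trains on them with the VGG architecture, which has proven effective in image classification. The result is compared, in time cost and accuracy, with a ResNet trained on 3+1D images and a ResNet trained on 3+1D skeleton images. On the KTH dataset, skeleton-extracted images raise accuracy from 96.3% to 98.71%. Using skeleton coordinates alone gives an accuracy of 93.75%, three to five percentage points below current 3+1D-based models, but cuts training time by 60% and speeds up recognition by about 45%.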

Parallel Abstract


Recognizing human behavior is a popular and challenging goal in the field of deep learning. With the rapid development of hardware, applications for human behavior recognition have sprung up, such as using images in the medical field to detect falls among the elderly and injuries that improper weight training may cause. Skeleton extraction has improved the accuracy of human behavior recognition, but most approaches are confined to images, or use the extracted joint coordinates only to enhance the images; the skeleton joint coordinates are seldom used directly as the training target. In this study, the two-dimensional joint coordinates are combined with the time dimension and packaged into a three-dimensional "picture" format, which is then used to train a VGG network, an architecture that is highly effective in image classification. We compare three methods in terms of time cost and accuracy: a 3+1D ResNet trained with images, a 3+1D ResNet trained with skeleton images, and a VGG-10 trained only with joint coordinates. On the KTH dataset, skeleton-extracted images raise accuracy from 96.3% to 98.71%. Using skeleton joint coordinates alone is slightly less accurate than the current 3+1D-based models (93.75%), but it saves 60% of the training time and increases recognition speed by 45%.
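
The packaging step described in the abstract, turning per-frame 2D joint coordinates plus time into a single "picture", might look roughly like the sketch below. The abstract does not give the exact layout, so everything here is an assumption made for illustration: the function name `pack_skeleton_sequence`, the fixed clip length of 32 frames, the 18-joint skeleton, the 160x120 frame size used for normalization, and the choice to store x and y as two channels.

```python
import numpy as np

def pack_skeleton_sequence(keypoints, num_frames=32, frame_size=(160, 120)):
    """Pack per-frame 2D joints into one (T, J, 2) array, a 2-channel "picture".

    keypoints  : sequence of per-frame arrays, each (J, 2) with (x, y) in pixels
    num_frames : fixed temporal length T (assumed value, not from the thesis)
    frame_size : (width, height) used to scale coordinates into [0, 1] (assumed)
    """
    seq = np.asarray(keypoints, dtype=np.float32)  # (T_in, J, 2)
    if seq.ndim != 3 or seq.shape[-1] != 2:
        raise ValueError("expected shape (frames, joints, 2)")
    # Uniformly sample (or repeat) frame indices so every clip has length T.
    idx = np.linspace(0, seq.shape[0] - 1, num_frames).round().astype(int)
    seq = seq[idx]
    # Scale x by width and y by height so values do not depend on resolution.
    return seq / np.asarray(frame_size, dtype=np.float32)

# Usage with synthetic data: 50 frames, 18 hypothetical joints.
rng = np.random.default_rng(0)
clip = rng.uniform([0, 0], [160, 120], size=(50, 18, 2))
image = pack_skeleton_sequence(clip)
print(image.shape)  # (32, 18, 2)
```

An array of this shape is far smaller than a stack of video frames, which is consistent with the reported savings in training and recognition time, although the thesis's actual preprocessing may differ.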

