A Compensation Method for Missing Consecutive Boxes to Improve Object Tracking

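Object detection uses computer vision and machine learning techniques to identify and classify objects in images or video. As model designs have grown more complex, the accuracy and speed of object detection have steadily improved. YOLOv4 has become a common choice for object tracking in many applications because of its modest hardware requirements, fast computation, and strong performance on the MS COCO test set. In research on object detection and classification in continuous video, however, more complex models are usually adopted to improve detection accuracy. This approach splits the continuous video into individual frames for detection, which tends to leave gaps where an object's box is not detected in consecutive frames, and the model has no explicit knowledge of the object's movement path. This study uses YOLOv4 as the front-end object detection architecture and then applies DeepSORT to track objects across consecutive frames. We also propose a continuous-frame object compensation method, CFM, which compensates for object boxes lost when the video is split into frames so that objects can be tracked continuously. Four sets of experiments confirm that CFM effectively compensates the consecutive object boxes of target objects in tracked video. In these experiments we also replace YOLOv4 and DeepSORT with the better-performing YOLOv7-E6E and StrongSORT, respectively, to examine CFM's performance, and we analyze CFM under different parameter settings. In addition, we evaluate the results on continuous video with the HOTA and MOTChallenge metrics to validate our method and compare it with other object tracking models. The experimental results show that CFM improves object tracking and recognition performance in environments with many occluding objects.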

Keywords

multiple occlusions; YOLOv4; object detection; object tracking; DeepSORT; object box compensation

Parallel Abstract

Object detection is a computer vision and machine learning technique used to recognize and classify objects in images or videos. As model designs have become more complex, the accuracy and speed of object detection have steadily improved. YOLOv4, owing to its lower hardware requirements, fast computation, and strong performance on the MS COCO test set, has become a common choice for object tracking in many applications. In research on object detection and classification in continuous video, however, more sophisticated models are often adopted to improve detection accuracy. This approach breaks the continuous video into individual frames for detection, which can leave object boxes undetected in some consecutive frames and gives the model no explicit information about the objects' movement paths. In this study, we employ YOLOv4 as the front-end object detection architecture and combine it with DeepSORT to track objects across image sequences. We also propose a method called Compensation Frame Mechanism (CFM) to compensate for the object boxes that go missing when the video is split into frames, thus achieving continuous object tracking. We conduct four sets of experiments that demonstrate the effectiveness of CFM in compensating for missing object boxes. In these experiments we also replace YOLOv4 and DeepSORT with the better-performing YOLOv7-E6E and StrongSORT, respectively, to examine CFM's performance, and we analyze CFM under different parameter settings. To validate our method, we evaluate the results on continuous video with the HOTA and MOTChallenge metrics and compare our approach with other object tracking models. The experimental results show that CFM improves object tracking and identification performance in environments with multiple occlusions.
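The idea of compensating object boxes that go missing between frames can be illustrated with a minimal sketch. The abstract does not describe how CFM works internally, so the code below is not CFM: it assumes a plain linear interpolation between the last and next detected box of a single track. The function name `fill_missing_boxes`, the `max_gap` parameter, and the `(x1, y1, x2, y2)` box layout are all hypothetical choices made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def fill_missing_boxes(track: Dict[int, Box], max_gap: int = 5) -> Dict[int, Box]:
    """Fill frames missing between two detections of one track.

    Assumption: simple linear interpolation, not the CFM method itself.
    `track` maps frame index -> detected box for a single track ID.
    Gaps with more than `max_gap` missing frames are left unfilled.
    """
    filled = dict(track)
    frames = sorted(track)
    for prev, nxt in zip(frames, frames[1:]):
        gap = nxt - prev
        missing = gap - 1
        if missing < 1 or missing > max_gap:
            continue  # nothing to fill, or gap too long to trust
        for f in range(prev + 1, nxt):
            t = (f - prev) / gap  # fraction of the way from prev to nxt
            filled[f] = tuple(
                a + t * (b - a) for a, b in zip(track[prev], track[nxt])
            )
    return filled


if __name__ == "__main__":
    # The object is detected in frames 0 and 4 but missed in frames 1-3.
    track = {0: (0.0, 0.0, 10.0, 10.0), 4: (8.0, 0.0, 18.0, 10.0)}
    for frame, box in sorted(fill_missing_boxes(track).items()):
        print(frame, box)
```

Capping the gap length reflects the intuition that a short dropout is probably an occlusion or a missed detection, while a long one may mean the object actually left the scene; the threshold of five frames is an arbitrary placeholder.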