This paper incorporates the classic parametric overlapped block motion compensation (POBMC) technique from video coding into a reinforcement learning framework for video prediction. Deep-learning-based prediction methods that generate explicit motion information often require a large number of learnable parameters for motion estimation and rely on handcrafted regularization to aid learning. Inspired by video compression methods based on sparse motion vectors, we propose parametric video prediction driven by sparse motion information generated from a small number of critical pixels and their associated motion vectors. The method proceeds iteratively, progressively refining its estimate of the future frame to complete frame synthesis. The selection of critical pixels and the estimation of their motion vectors are performed by two neural networks trained via reinforcement learning. Our method achieves results on par with existing approaches in both single-step and multi-step prediction on the Caltech Pedestrian, UCF-101, and CIF test sets, demonstrating that it delivers good prediction quality even when trained on a small amount of data.
This paper leverages a classic prediction technique, known as parametric overlapped block motion compensation (POBMC), in a reinforcement learning framework for video prediction. Learning-based prediction methods with explicit motion models often suffer from having to estimate large numbers of motion parameters and from reliance on artificial regularization. Inspired by the success of sparse motion-based prediction in video compression, we propose parametric video prediction on a sparse motion field composed of a few critical pixels and their motion vectors. The prediction is achieved by gradually refining the estimate of a future frame in iterative, discrete steps. Along the way, the identification of critical pixels and the estimation of their motion are addressed by two neural networks trained in a reinforcement learning setting. Our model achieves state-of-the-art performance on the Caltech Pedestrian, UCF-101, and CIF datasets in one-step and multi-step prediction tests. It generalizes well and is able to learn effectively from small amounts of training data.