Deep-Learning-Assisted Prediction of Crane Payload Sway Using a Recurrent Kalman Network

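Many researchers have studied automated control strategies for cranes. Their work requires knowing the crane's physical parameters and deriving a physical model before a control method can be implemented. In practice, however, these parameters cannot be observed, which hinders the on-site application of existing automation methods. An experienced crane operator, by contrast, can predict the future position of the payload and decide on a control strategy purely by watching how the payload moves. Automating this visual prediction and applying it to cranes would remove the dependence of crane automation on physical parameters and make deployment on construction sites more feasible. The greatest challenge in predicting the payload's future position from vision alone is that a computer cannot understand the abstract meaning of the pixel patterns in an image. This study therefore uses the Recurrent Kalman Network (RKN), which is well suited to noisy, high-dimensional state problems, to predict the future position of the payload. The RKN is first trained on side-view images of a simple pendulum generated by a numerical model, so that it learns the motion pattern of the payload from consecutive pendulum images and yields a trained model. Images of the moving payload collected by a camera are then used to predict the payload's future position. The study compares the ability of Long Short-Term Memory (LSTM) networks and the RKN to predict the future position of the payload. The results show that the RKN achieves an average prediction error of 1.63 pixels, compared with 22.0 pixels for the LSTM. Besides capturing temporal relationships, the RKN also estimates the correlations among data dimensions and their uncertainty. The RKN therefore needs no physical parameters and can predict the future position of the payload from image data alone.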
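The training data described above, side-view pendulum images produced by a numerical model, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the semi-implicit Euler integrator, the 24 x 24 frame size, the rod length in pixels, the disc-shaped bob, the Gaussian pixel noise, and the hypothetical function names `simulate_pendulum`, `render_side_view`, and `make_sequence`.

```python
import numpy as np

def simulate_pendulum(theta0, omega0, length, dt, steps, g=9.81):
    """Integrate theta'' = -(g / length) * sin(theta) with semi-implicit Euler."""
    theta, omega = theta0, omega0
    angles = np.empty(steps)
    for i in range(steps):
        angles[i] = theta
        omega += -(g / length) * np.sin(theta) * dt  # update angular velocity first
        theta += omega * dt                          # then the angle
    return angles

def render_side_view(theta, size=24, rod_px=9.0, bob_radius=2.0):
    """Render one grayscale frame: the bob is a filled disc below a top-centre pivot."""
    pivot_x = (size - 1) / 2.0
    pivot_y = 2.0
    bob_x = pivot_x + rod_px * np.sin(theta)
    bob_y = pivot_y + rod_px * np.cos(theta)  # image y axis points downward
    ys, xs = np.mgrid[0:size, 0:size]
    inside = (xs - bob_x) ** 2 + (ys - bob_y) ** 2 <= bob_radius ** 2
    return inside.astype(np.float32), (bob_x, bob_y)

def make_sequence(theta0=0.6, steps=50, noise_std=0.1, seed=0):
    """Return noisy frames (steps, H, W) and true bob positions (steps, 2) in pixels."""
    rng = np.random.default_rng(seed)
    angles = simulate_pendulum(theta0, 0.0, length=1.0, dt=0.05, steps=steps)
    frames, positions = [], []
    for theta in angles:
        frame, pos = render_side_view(theta)
        # Assumed observation noise, standing in for a disturbed camera image.
        frame = np.clip(frame + rng.normal(0.0, noise_std, frame.shape), 0.0, 1.0)
        frames.append(frame)
        positions.append(pos)
    return np.stack(frames).astype(np.float32), np.array(positions)

frames, positions = make_sequence()
```

In a setup like this, a sequence model would receive `frames` as input and be trained to output the entries of `positions` at later time steps; how the thesis actually builds its inputs and targets is not stated in the abstract.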

Keywords

deep learning; machine learning; artificial intelligence; time series prediction; computer vision; payload system

Parallel Abstract

Many researchers have studied automatic control strategies for cranes, but their methods require knowledge of the crane's physical parameters and a derived physical model. In practice, these parameters cannot be observed, which hinders the application of crane automation methods on site. Experienced crane operators, on the other hand, can decide on a control strategy by visually observing the movement of the payload and predicting its future position. If such a visual prediction method were automated and applied to cranes, it would remove the need for physical parameters and improve the feasibility of automation on construction sites. Predicting the future position of a payload from vision alone is challenging because computers cannot understand the abstract meaning of combinations of pixels in an image. Therefore, this study employs the Recurrent Kalman Network (RKN), which is well suited to noisy, high-dimensional state problems, to predict the future position of the payload. We train the RKN on side-view images of a pendulum generated by a numerical model, so that it learns the motion pattern of the payload from consecutive pendulum images and yields a trained model. Image data of the moving payload collected by a camera are then used to predict its future position. This study compares the ability of Long Short-Term Memory (LSTM) networks and the RKN to predict the future position of the payload from images. The results show an average prediction error of 1.63 pixels for the RKN, compared with 22.0 pixels for the LSTM. In addition to capturing temporal relationships, the RKN can estimate the correlations among data dimensions and their uncertainty. The RKN can therefore predict the future position of the payload from image data alone, without physical parameters.
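The comparison above is reported as an average prediction error in pixels. The abstract does not say how that error is defined, so the following is only a sketch of one plausible reading: the Euclidean distance between predicted and true payload positions, averaged over all predictions. The function name `mean_pixel_error` and the toy coordinates are hypothetical and are not data from the study.

```python
import numpy as np

def mean_pixel_error(predicted, actual):
    """Average Euclidean distance, in pixels, between predicted and true positions.

    Both inputs are array-likes of shape (n, 2) holding (x, y) pixel coordinates.
    This definition is an assumption; the abstract does not specify the metric.
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape:
        raise ValueError("predicted and actual must have the same shape")
    distances = np.linalg.norm(predicted - actual, axis=-1)
    return float(np.mean(distances))

# Toy values for illustration only.
toy_predicted = [[3.0, 4.0], [10.0, 10.0]]
toy_actual = [[0.0, 0.0], [10.0, 10.0]]
toy_error = mean_pixel_error(toy_predicted, toy_actual)  # distances 5 and 0 average to 2.5
```

Under this reading, a single number summarizes each model on the same set of test sequences, which is how the two figures quoted in the abstract can be compared directly.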

Parallel Keywords

deep learning; machine learning; artificial intelligence; time series prediction; computer vision; payload system
