Generating Basketball Play Simulations from Tactic Sketches Using Reinforcement Learning

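This paper presents a reinforcement learning approach that generates a simulation of a full possession from a basketball tactic sketch, simulating both the execution of the offensive tactic and the likely responses of the defense. Once visualized, the generated results make it easier for coaches to explain and analyze tactics. After a game, they can also be compared with how the tactic was actually executed, or used to review defensive performance on the court. We train two models, an offense model and a defense model, and let them interact in the same environment. Each model acts on observations made up of the environment states from the past few seconds; the offense model is additionally conditioned on the input tactic sketch and carries out its tactical instructions. We train the continuous-action-space models with Proximal Policy Optimization (PPO), a model-free algorithm that improves on actor-critic methods. We use a Voronoi diagram to compute the court space each player occupies and sum the occupied space as the reward, which yields a dense reward function that addresses the training difficulty caused by sparse rewards. We add a moving-distance penalty term to the reward to improve the generated simulations. Finally, we evaluate the models by comparing the simulation results with real data on objective statistics.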
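The Voronoi-based dense reward with a moving-distance penalty can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the half-court dimensions, the grid approximation of the Voronoi cells, the normalization to a fraction of the court, the penalty weight, and the names `voronoi_ownership` and `offense_reward` are all hypothetical and do not come from the abstract.

```python
import numpy as np

# Assumed half-court size in feet; the abstract does not state the court model.
COURT_LENGTH_FT = 47.0
COURT_WIDTH_FT = 50.0


def voronoi_ownership(offense_xy, defense_xy, cell_ft=1.0):
    """Approximate the fraction of the court inside offensive Voronoi cells.

    The court is discretized into square cells, and each cell is assigned to
    its nearest player, which is a grid approximation of a Voronoi diagram.
    """
    xs = np.arange(cell_ft / 2, COURT_LENGTH_FT, cell_ft)
    ys = np.arange(cell_ft / 2, COURT_WIDTH_FT, cell_ft)
    grid = np.stack(np.meshgrid(xs, ys, indexing="ij"), axis=-1).reshape(-1, 2)

    players = np.concatenate([offense_xy, defense_xy], axis=0)
    # Squared distance from every cell centre to every player.
    d2 = ((grid[:, None, :] - players[None, :, :]) ** 2).sum(axis=-1)
    owner = d2.argmin(axis=1)

    # Offensive players come first in `players`, so low indices mean offense.
    return float((owner < len(offense_xy)).mean())


def offense_reward(prev_offense_xy, offense_xy, defense_xy, distance_weight=0.01):
    """Dense per-step reward: court ownership minus a moving-distance penalty.

    `distance_weight` is a hypothetical hyperparameter chosen for illustration.
    """
    ownership = voronoi_ownership(offense_xy, defense_xy)
    moved = np.linalg.norm(offense_xy - prev_offense_xy, axis=1).sum()
    return ownership - distance_weight * moved
```

Because the ownership term changes at every step as players move, it gives the agent feedback throughout the possession rather than only when a shot is made or missed, which is the motivation for a dense reward. The penalty term discourages movement that does not gain court space.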

Keywords

Basketball Tactics; Reinforcement Learning

Parallel Abstract

This paper presents a method for simulating basketball plays from tactic sketches. It simulates how an offensive tactic will be executed and how the defenders will react. The simulated plays make it easier for coaches to illustrate and evaluate a newly developed tactic, and players can review their performance after a game by comparing their actual plays with the simulated ones. To this end, we formulate the problem as a reinforcement learning task. Two agents represent the offensive team and the defensive team and interact with each other in the same environment. The offense agent additionally receives tactic instructions extracted from the tactic sketch as a condition. The agents choose actions according to the environment state, and the environment applies those actions to simulate the play. We use the model-free algorithm Proximal Policy Optimization (PPO) to train our models, which have continuous action spaces. To address the sparse-reward problem, we design a Voronoi reward based on each player's ownership of court space. To improve simulation quality, we add a penalty on moving distance. To evaluate the system, we carry out a quantitative analysis that objectively compares real and simulated plays.
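For reference, the clipped surrogate objective that PPO is commonly trained with is shown below. The abstract does not say which PPO variant or which hyperparameters were used, so this is the standard published form rather than a statement of the authors' exact setup. Here $r_t(\theta)$ is the probability ratio between the new and old policies, $\hat{A}_t$ is the advantage estimate, and $\epsilon$ is the clipping range.

```latex
L^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\left[
      \min\left(
        r_t(\theta)\,\hat{A}_t,\;
        \operatorname{clip}\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t
      \right)
    \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

Clipping the ratio keeps each policy update close to the policy that collected the data, which helps training stay stable when the offense and defense agents are learning against each other.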

Parallel Keywords

Basketball Strategies; Reinforcement Learning

