The traditional experience-based replenishment method used for domestic vending machines can reduce their sales volume. To address this problem, this paper proposes a replenishment strategy that adjusts dynamically according to predicted sales volume. The method first forecasts sales demand and then uses a reinforcement learning algorithm to learn the proportional relationship between the remaining stock of a commodity and its replenishment quantity, so as to minimize replenishment loss. Simulations run in the PyCharm environment on vending machine data provided by a platform show that the dynamically adjustable replenishment model effectively reduces replenishment loss while still meeting sales demand, thereby maximizing the operators' benefit.
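The two-stage idea described above (forecast demand, then learn how much to replenish given the remaining stock) can be illustrated with a minimal sketch. It is not the paper's implementation: the moving-average forecaster, the tabular Q-learning update, the cost weights, the machine capacity, the candidate ratios, and the uniformly random demand are all assumptions chosen only to keep the example self-contained, and every name in it is hypothetical.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
RATIOS = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]  # replenishment as a fraction of forecast demand
CAPACITY = 20                               # assumed slot capacity of the machine
N_STATES = 5                                # number of buckets for the remaining stock


def forecast_demand(history, window=3):
    """Moving-average forecast of next-period sales (stand-in for the paper's forecaster)."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def replenishment_loss(stock, demand, shortage_cost=2.0, holding_cost=1.0):
    """Assumed loss model: penalty for unmet demand plus penalty for leftover stock."""
    shortage = max(demand - stock, 0)
    leftover = max(stock - demand, 0)
    return shortage_cost * shortage + holding_cost * leftover


def surplus_state(surplus):
    """Map the remaining stock to one of N_STATES discrete buckets."""
    return min(int(surplus / CAPACITY * N_STATES), N_STATES - 1)


def train(episodes=2000, steps=30, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over (surplus bucket, replenishment ratio) pairs."""
    rng = random.Random(seed)
    q = [[0.0] * len(RATIOS) for _ in range(N_STATES)]
    for _ in range(episodes):
        history = [rng.randint(5, 15) for _ in range(3)]  # synthetic past sales
        surplus = rng.randint(0, CAPACITY)
        for _ in range(steps):
            s = surplus_state(surplus)
            # Epsilon-greedy choice of replenishment ratio.
            if rng.random() < epsilon:
                a = rng.randrange(len(RATIOS))
            else:
                a = max(range(len(RATIOS)), key=lambda i: q[s][i])
            predicted = forecast_demand(history)
            # Replenish in proportion to the forecast, capped by capacity.
            stock = min(surplus + round(RATIOS[a] * predicted), CAPACITY)
            demand = rng.randint(5, 15)  # synthetic demand for this period
            reward = -replenishment_loss(stock, demand)
            surplus = max(stock - demand, 0)
            s_next = surplus_state(surplus)
            # Standard Q-learning update.
            q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])
            history.append(demand)
    return q


def best_ratio(q, surplus):
    """Greedy replenishment ratio for a given remaining stock."""
    row = q[surplus_state(surplus)]
    return RATIOS[max(range(len(RATIOS)), key=lambda i: row[i])]
```

After training, `best_ratio(q, surplus)` returns the ratio the table currently prefers for that level of remaining stock; multiplying it by the demand forecast gives a replenishment quantity. A real system would replace the synthetic demand with historical sales records and the moving average with whatever forecasting model is actually used.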