Traditional path planning methods depend on a model of the environment. To address this limitation, this paper studies path planning for mobile robots in unknown environments using two reinforcement learning algorithms, Sarsa and Q-learning. First, a simulation environment is built on a grid map, and the algorithms and the Q-value table are designed. Second, the robot obtains reward values by interacting with the environment, and its action policy is updated by iteratively modifying the Q-value table. Finally, the robot's actions converge toward the optimal action set. Experimental results show that, for the same number of iteration steps, the Sarsa algorithm tends to be conservative and converges slowly, whereas the Q-learning algorithm converges more readily, plans a path quickly, and shows better exploration ability in unknown environments.
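
To make the difference between the two update rules concrete, the following is a minimal Python sketch of tabular Sarsa and Q-learning on a small grid map. It is not the implementation used in the study: the 4x4 layout, the reward values (+10 for the goal, -5 for a collision, -1 per move), the hyperparameters, and all function names (`step`, `choose`, `train`, `greedy_path`) are assumptions chosen only for illustration.

```python
import random

# Hypothetical 4x4 grid map: 0 = free cell, 1 = obstacle (illustrative layout).
GRID = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action and return (next_state, reward, done).

    Assumed rewards: +10 at the goal, -5 for hitting a wall or obstacle
    (the robot stays in place), -1 for any other move.
    """
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    if not (0 <= r < 4 and 0 <= c < 4) or GRID[r][c] == 1:
        return state, -5.0, False
    if (r, c) == GOAL:
        return (r, c), 10.0, True
    return (r, c), -1.0, False


def choose(Q, state, epsilon, rng):
    """Epsilon-greedy action selection from the Q-value table."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = Q[state]
    return values.index(max(values))


def train(method, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Train a Q-value table with method 'sarsa' or 'q_learning'."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(4) for c in range(4)}
    for _ in range(episodes):
        state = START
        action = choose(Q, state, epsilon, rng)
        for _ in range(200):  # cap on steps per episode (assumed)
            nxt, reward, done = step(state, action)
            # The next behaviour action is drawn before the update so that
            # both methods can share one loop.
            next_action = choose(Q, nxt, epsilon, rng)
            if done:
                target = reward
            elif method == "sarsa":
                # On-policy: bootstrap from the action actually taken next.
                target = reward + gamma * Q[nxt][next_action]
            else:
                # Off-policy: bootstrap from the best action in the next state.
                target = reward + gamma * max(Q[nxt])
            Q[state][action] += alpha * (target - Q[state][action])
            state, action = nxt, next_action
            if done:
                break
    return Q


def greedy_path(Q, max_len=32):
    """Follow the greedy policy from START; stop at GOAL or after max_len cells."""
    path, state = [START], START
    while state != GOAL and len(path) < max_len:
        state, _, _ = step(state, Q[state].index(max(Q[state])))
        path.append(state)
    return path
```

Calling `greedy_path(train("q_learning"))` and `greedy_path(train("sarsa"))` returns the cells visited under each learned table. The only difference between the two methods is the bootstrap term in the target: Sarsa uses the value of the action it will actually take, including exploratory ones, which is one common explanation for its more conservative behaviour near obstacles.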