With the development of communication technology, many hosts now have multiple network interfaces. However, the traditional Transmission Control Protocol (TCP) uses only one network interface per session, which leaves much of the available network bandwidth unused. Multipath TCP (MPTCP) is a multi-path transmission control protocol proposed by the IETF. It can use multiple network interfaces simultaneously within a single session, which greatly improves the utilization of network resources. However, multi-path transmission also exposes MPTCP to problems such as packet loss and network congestion, which can easily degrade transmission performance and need to be solved urgently. Based on a study of existing scheduling algorithms, this paper proposes a transmission scheduling algorithm based on the Q-Learning model that adapts to dynamic changes in the network and reduces network transmission delay. Experimental results show that its low-latency transmission performance is 25% higher than that of minRTT, the default transmission scheduling algorithm of Linux MPTCP.
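
To make the contrast between the two schedulers concrete, the following is a minimal sketch rather than the algorithm evaluated in this paper. It assumes a tabular Q-Learning agent whose state is a tuple of bucketed per-path RTTs, whose action is the index of the subflow to send on, and whose reward is the negative of the observed delay. The names `min_rtt_path`, `QScheduler`, `state_of`, and the parameters `alpha`, `gamma`, `epsilon`, and `bucket_ms` are all illustrative assumptions.

```python
import random
from collections import defaultdict


def min_rtt_path(rtts_ms):
    """Baseline in the spirit of minRTT: pick the subflow with the lowest RTT."""
    return min(range(len(rtts_ms)), key=lambda i: rtts_ms[i])


class QScheduler:
    """Tabular Q-Learning path selector (illustrative, not the paper's design)."""

    def __init__(self, n_paths, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_paths = n_paths
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Q-table: state -> one estimated value per path, initialised to zero.
        self.q = defaultdict(lambda: [0.0] * n_paths)

    @staticmethod
    def state_of(rtts_ms, bucket_ms=20):
        """Discretise per-path RTTs into coarse buckets (assumed state encoding)."""
        return tuple(int(r // bucket_ms) for r in rtts_ms)

    def choose(self, state):
        """Epsilon-greedy choice of the subflow to schedule the next segment on."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_paths)
        values = self.q[state]
        return max(range(self.n_paths), key=lambda a: values[a])

    def update(self, state, action, reward, next_state):
        """Standard Q-Learning update toward reward + gamma * max_a' Q(s', a')."""
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])
```

In use, a caller would compute `state_of(...)` from current path measurements, call `choose`, send on the selected subflow, and then call `update` with a reward such as the negative measured delay, so that paths that recently produced high delay become less attractive as network conditions change.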