Automatic speech emotion recognition is widely used in intelligent human-computer interaction, where it can improve people's efficiency in work and study and their quality of life. To further improve the accuracy of speech emotion classification and to address the problems of manual feature extraction, insufficient sample size, and model over-fitting, this paper proposes a VPCNN model. First, using transfer learning, the model adopts part of the VGG16 network structure and introduces Batch Normalization (BN) to extract deep speech emotion features, which improves the generalization ability of the model. Second, principal component analysis (PCA) is introduced to reduce the dimensionality of the features, and softmax is used to classify the emotions. The experimental results verify the effectiveness of the model in speech emotion classification and recognition.
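
The last two stages described above (dimensionality reduction by principal component analysis, then softmax classification) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the random `deep_features` array only stands in for features that a truncated VGG16 with BN might produce, and the feature size (512), the number of retained components (16), the number of emotion classes (4), and the untrained `weights` are all assumptions chosen for illustration.

```python
import numpy as np


def pca_fit(features, n_components):
    """Return the feature mean and the top principal directions."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project features onto the retained principal directions."""
    return (features - mean) @ components.T


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Placeholder for deep features (assumed shape: 32 utterances x 512 dims).
deep_features = rng.normal(size=(32, 512))

# Reduce the feature dimension with PCA (16 components is an assumption).
mean, components = pca_fit(deep_features, n_components=16)
reduced = pca_transform(deep_features, mean, components)

# Untrained linear layer followed by softmax (4 classes is an assumption).
num_emotions = 4
weights = rng.normal(scale=0.01, size=(16, num_emotions))
bias = np.zeros(num_emotions)
probs = softmax(reduced @ weights + bias)
predicted = probs.argmax(axis=1)
```

In a real pipeline, `pca_fit` would be run on training features only, the same `mean` and `components` would be reused for test features, and the softmax layer's weights would be learned rather than drawn at random.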