The Research of Speech Emotion Recognition Based on VPCNN Model

Automatic speech emotion recognition is widely used in intelligent human-computer interaction, where it can improve people's efficiency in work and study and their quality of life. To further improve the accuracy of speech emotion classification and to address the problems of manual feature extraction, insufficient sample size, and model over-fitting, this paper proposes the VPCNN model. First, using transfer learning, the model adopts part of the VGG16 network structure and introduces Batch Normalization (BN) to extract deep speech emotion features, which improves the model's generalization ability. Second, principal component analysis (PCA) is used to reduce the dimensionality of the features, and softmax is used to classify the emotions. The experimental results verify the effectiveness of the model for speech emotion classification and recognition.
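The last two stages described in the abstract (principal component analysis followed by a softmax classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the deep features are replaced by random stand-in vectors, the sizes (512 input dimensions, 16 retained components, 4 emotion classes) are hypothetical, and the softmax layer is untrained, so the predictions carry no meaning beyond showing the data flow.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row-vector features onto their top principal components."""
    mean = features.mean(axis=0)
    centered = features - mean
    # SVD of the centered data: the rows of vt are the principal
    # directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_samples, n_deep, n_reduced, n_emotions = 32, 512, 16, 4

# Stand-in for the deep features that the truncated VGG16 + BN
# extractor would produce (assumption: one feature vector per utterance).
deep_features = rng.normal(size=(n_samples, n_deep))

# Stage 1: dimensionality reduction with PCA.
reduced, mean, components = pca_reduce(deep_features, n_reduced)

# Stage 2: softmax classification with an untrained, randomly
# initialised linear layer (a trained layer would replace these weights).
weights = rng.normal(scale=0.01, size=(n_reduced, n_emotions))
bias = np.zeros(n_emotions)
probs = softmax(reduced @ weights + bias)
predicted = probs.argmax(axis=1)
```

In a real pipeline the PCA mean and components would be fitted on the training set only and then reused to project validation and test features.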

Keywords

Speech Recognition; Emotional Classification and Recognition; Convolutional Neural Network; Principal Component Analysis

