

Lyrics and Audio Features-based Hit Song Prediction

Advisor: 馮勃翰


This study explores whether information intrinsic to the music itself, namely audio and lyrics-based features, is enough to predict whether a song will be a hit or a flop. We build machine learning models on audio features provided by Spotify and lyrics provided by KKBOX. We test three models: one using only lyrics, one using only audio features, and one combining both. The combined model outperforms the lyrics-only model by 3%, and the audio-only model outperforms the lyrics-only model by 2%.
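The three model variants differ only in which features they receive as input. The sketch below is a minimal illustration of that setup, not the thesis's actual pipeline: the bag-of-words lyric representation, the toy vocabulary, the chosen audio feature names, and the helper names (`lyrics_vector`, `audio_vector`, `build_features`) are all assumptions made for this example, and the song values are invented.

```python
from collections import Counter

# Hypothetical toy vocabulary and audio feature names. The abstract does not
# say which lyric representation or which Spotify fields were used.
VOCAB = ["love", "night", "dance", "heart"]
AUDIO_KEYS = ["danceability", "energy", "valence"]


def lyrics_vector(lyrics, vocab=VOCAB):
    """Bag-of-words counts over a fixed vocabulary."""
    counts = Counter(lyrics.lower().split())
    return [float(counts[w]) for w in vocab]


def audio_vector(audio, keys=AUDIO_KEYS):
    """Audio features in a fixed order; missing keys default to 0.0."""
    return [float(audio.get(k, 0.0)) for k in keys]


def build_features(lyrics, audio, mode):
    """Return the input vector for one of the three model variants."""
    if mode == "lyrics":
        return lyrics_vector(lyrics)
    if mode == "audio":
        return audio_vector(audio)
    if mode == "mixed":
        # Simple concatenation of the two feature groups.
        return lyrics_vector(lyrics) + audio_vector(audio)
    raise ValueError(f"unknown mode: {mode}")


# Invented example song, for illustration only.
song_lyrics = "Dance all night love love"
song_audio = {"danceability": 0.8, "energy": 0.6, "valence": 0.5}
for mode in ("lyrics", "audio", "mixed"):
    print(mode, build_features(song_lyrics, song_audio, mode))
```

Each vector would then be passed to the same kind of classifier, so that any difference in performance can be attributed to the features rather than to the model.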


Keywords: Machine Learning; Deep Learning; Ensemble Methods; NLP; MIR


