  • Conference paper
  • Open Access

Audio Classification Using Semantic Transformation and Classifier Ensemble


This paper presents our winning audio classification system in MIREX 2010. The system works as follows. In the training phase, frame-based 70-dimensional feature vectors are first extracted from each training audio clip with MIRToolbox. Next, the Posterior Weighted Bernoulli Mixture Model (PWBMM) is applied to transform the frame-level feature vectors of the clip into a fixed-dimensional semantic vector defined over a set of pre-defined music tags; we call this procedure Semantic Transformation. Finally, for each class, the semantic vectors of the associated training clips are used to train an ensemble classifier consisting of SVM and AdaBoost classifiers. In the classification phase, a test audio clip is first represented by its semantic vector, and the class with the highest ensemble score is then selected as the final output. Our system was ranked first out of 36 submissions in the MIREX 2010 audio mood classification task.
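The classification phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how PWBMM computes the semantic vector or how the SVM and AdaBoost scores are combined, so the sketch assumes simple mean pooling of per-frame tag probabilities as a stand-in for Semantic Transformation and a weighted average as the fusion rule. The names `pool_frames`, `fuse_and_pick`, and `svm_weight` are hypothetical.

```python
from typing import Dict, List, Sequence


def pool_frames(frame_tag_probs: Sequence[Sequence[float]]) -> List[float]:
    """Collapse per-frame tag probabilities into one fixed-length vector.

    ASSUMPTION: plain mean pooling, used here only as a stand-in for the
    PWBMM-based Semantic Transformation, whose details the abstract omits.
    """
    if not frame_tag_probs:
        raise ValueError("clip has no frames")
    n_tags = len(frame_tag_probs[0])
    totals = [0.0] * n_tags
    for frame in frame_tag_probs:
        if len(frame) != n_tags:
            raise ValueError("every frame must cover the same tag set")
        for i, p in enumerate(frame):
            totals[i] += p
    # The output length depends only on the number of tags, not on clip length.
    return [t / len(frame_tag_probs) for t in totals]


def fuse_and_pick(
    svm_scores: Dict[str, float],
    ada_scores: Dict[str, float],
    svm_weight: float = 0.5,
) -> str:
    """Combine per-class scores from two classifiers and return the top class.

    ASSUMPTION: a weighted average of the two scores; the actual fusion
    rule of the ensemble is not given in the abstract.
    """
    if svm_scores.keys() != ada_scores.keys():
        raise ValueError("both classifiers must score the same classes")
    fused = {
        cls: svm_weight * svm_scores[cls] + (1.0 - svm_weight) * ada_scores[cls]
        for cls in svm_scores
    }
    # The class with the highest fused score is the final output.
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Two frames, two tags: clips of any length map to a 2-dimensional vector.
    semantic_vector = pool_frames([[0.2, 0.8], [0.4, 0.6]])
    print(semantic_vector)

    # Made-up per-class scores, for illustration only.
    print(fuse_and_pick({"happy": 0.9, "sad": 0.1}, {"happy": 0.2, "sad": 0.6}))
```

In a full system the per-class scores would come from the trained SVM and AdaBoost models applied to the clip's semantic vector; they are hard-coded here only to keep the sketch self-contained.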


