Theoretical Study and Performance Improvement of Multi-Layer Classifiers

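Binary decision trees (BDTs) are among the most commonly used classification methods. A BDT is built by repeatedly splitting a candidate node into two child nodes, each of which can be split further until a stopping criterion is met, yielding the final classification tree model. In the literature, another classification tree method has been proposed that splits each candidate node into two types of child nodes: child nodes whose class can be determined and a child node whose class is not yet determined. Only the unclassified child node can be split further; in other words, among the child nodes produced at each layer, only one is allowed to continue splitting, hence the name multi-layer classifier (MLC). Although MLC has a simple single-trunk structure and its construction mechanism appears reasonable from the perspective of statistical data distributions, it has not been widely used, mainly because it lacks a theoretical foundation and testing on a large number of real cases. Each layer of an MLC can split off either one or two classified child nodes. When each layer produces exactly one classified node and one unclassified node, we call this simplified MLC a binary multi-layer classifier (Binary MLC, BMLC). We first establish the theoretical foundation of BMLC and propose an algorithm that uses the variance ratio to construct it, referred to as the Variance-Ratio Binary MLC (VRBMLC). Beyond establishing the theory and the algorithm, we also verify the superior classification performance of VRBMLC on public datasets. Although VRBMLC offers better interpretability and classification performance, binary splitting produces relatively deep trees. To further reduce the depth of the trees built by VRBMLC and improve classification performance, this study extends VRBMLC with a ternary split that allows each candidate node to produce two classified child nodes and one unclassified child node; the resulting method is called the Variance-Ratio MLC (VRMLC). Because VRMLC can use only one feature to split the candidate node at each layer, this study further proposes a new multivariate multi-layer classifier, called the Variance-Ratio Multivariate MLC (VRMMLC). VRMMLC integrates multiple features at each node to construct a multivariate discriminant hyperplane and then applies the same variance-ratio ternary split, so the resulting classification tree is both more compact and more efficient. This study uses 40 public datasets and 3 high-dimensional datasets to validate the performance of VRBMLC, VRMLC, and VRMMLC. The experimental results show that VRBMLC is easier to interpret than three classic BDTs and achieves better classification results. VRMLC and VRMMLC further simplify the classification trees built by VRBMLC, which improves interpretability, and they outperform the three BDTs and four classic multivariate decision trees in classification.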

Keywords

classification; classifiers; decision tree; multivariate decision tree; machine learning; tree construction methods

Parallel Abstract

Binary decision trees (BDTs) are among the most common classifiers. Typically, the tree model of a BDT is constructed by recursively splitting each node into two less impure child nodes. The child nodes can be split further until the stopping criteria are met. In the literature, Chang et al. [1] proposed another type of classification tree, called the Multi-Layer Classifier (MLC), that splits candidate nodes into two types of child nodes: classified child nodes with purer instances and an unclassified child node with rather impure instances. Only the unclassified node in each layer can be split further into the next layer, resulting in a single straight trunk structure. Despite the plausibility of MLC from the perspective of statistical data distributions, it has not been widely used because it lacks a theoretical basis and thorough performance tests on real cases. A typical MLC can generate one or two classified child nodes and one unclassified child node in each layer. Thus, a simpler version of MLC, called binary MLC (BMLC), is first proposed to allow only one classified child node and one unclassified child node. For BMLC, we first lay the theoretical basis and then propose a variance-ratio algorithm, referred to as the Variance-Ratio Binary MLC (VRBMLC). In addition to the theoretical and algorithmic development, we validate the superiority of VRBMLC's performance over that of BDTs on various publicly available datasets. Though VRBMLC is effective and more interpretable, it generates a deep single straight trunk because only a univariate binary split is adopted in each layer. To further reduce the tree depth of VRBMLC and improve its classification performance, this study, building on the theoretical basis of VRBMLC, develops the theoretical foundation for a ternary split that allows two classified child nodes and one unclassified child node at each layer.
Based on the developed theories, we propose a new variance-ratio algorithm, referred to as Variance-Ratio MLC (VRMLC). Moreover, a multivariate version of VRMLC, called Variance-Ratio Multivariate MLC (VRMMLC), is proposed to integrate multiple features to construct a multivariate discriminant hyperplane at the node to be split. The variance-ratio algorithm that performs binary or ternary splits on the hyperplane can also be applied to efficiently construct a shorter, more compact single straight trunk. This study validates the performance of VRBMLC, VRMLC, and VRMMLC using 40 regular datasets and 3 high-dimensional datasets collected from well-known repositories. The experimental results show that VRBMLC is easier to interpret and achieves better classification results than three state-of-the-art BDT methods. Furthermore, the proposed VRMLC and VRMMLC are found to have not only better interpretability than VRBMLC, by simplifying its tree structure, but also better classification results than three state-of-the-art BDTs and four state-of-the-art multivariate trees.
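
To make the single-trunk structure concrete, the following is a minimal Python sketch of a binary MLC (one classified child and one unclassified child per layer). It is an illustration under stated assumptions, not the VRBMLC algorithm: the abstract does not define the variance-ratio criterion, so the sketch substitutes a simple purity threshold (the hypothetical parameter `min_purity`) to decide which side of a univariate split counts as classified. The function names `best_layer`, `fit_trunk`, and `predict` are likewise hypothetical.

```python
from collections import Counter


def majority(labels):
    """Return (most common label, its share of the node)."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def best_layer(X, y, min_purity):
    """Pick the univariate split whose classified side is largest.

    ASSUMPTION: a purity threshold stands in for the variance-ratio
    criterion, which the abstract does not specify.
    Returns (feature, threshold, side, label), or None if no side qualifies.
    """
    best, best_size = None, 0
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for side in ("le", "gt"):
                idx = [i for i, row in enumerate(X)
                       if (row[f] <= t) == (side == "le")]
                # Skip splits that leave either child empty.
                if not idx or len(idx) == len(X):
                    continue
                label, purity = majority([y[i] for i in idx])
                if purity >= min_purity and len(idx) > best_size:
                    best, best_size = (f, t, side, label), len(idx)
    return best


def fit_trunk(X, y, min_purity=1.0, max_layers=10):
    """Build the trunk: each layer peels off one classified child node;
    only the unclassified remainder is passed to the next layer."""
    layers = []
    while len(layers) < max_layers and len(set(y)) > 1:
        layer = best_layer(X, y, min_purity)
        if layer is None:
            break
        f, t, side, _ = layer
        layers.append(layer)
        # Keep only the instances that fall on the unclassified side.
        keep = [i for i, row in enumerate(X)
                if (row[f] <= t) != (side == "le")]
        X, y = [X[i] for i in keep], [y[i] for i in keep]
    # The last unclassified node is labeled by majority vote.
    return layers, majority(y)[0]


def predict(model, row):
    """Walk down the trunk and return the label of the first classified
    node that contains the row, else the final node's label."""
    layers, fallback = model
    for f, t, side, label in layers:
        if (row[f] <= t) == (side == "le"):
            return label
    return fallback
```

Calling `fit_trunk(X, y)` on a list of feature rows and a list of labels returns the ordered layers plus a fallback label, and `predict` reads them top to bottom like a rule list. That ordered-rule reading is the interpretability argument the abstract makes for a single trunk; a ternary variant would let each layer peel off two classified sides instead of one.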

Parallel Keywords

classification; classifiers; decision tree; multivariate decision tree; machine learning; tree construction

References

K. J. Chang et al., "Method for multi-layer classifier," U.S. Patent No. 8,572,006, 2013.

L.-L. Wang, H. Y. T. Ngan, and N. H. C. Yung, "Automatic incident classification for large-scale traffic data by adaptive boosting SVM," Information Sciences, vol. 467, pp. 59-73, 2018.

M. Khashei and M. Bijari, "A novel hybridization of artificial neural networks and ARIMA models for time series forecasting," Applied Soft Computing, vol. 11, no. 2, pp. 2664-2675, 2011.

P. Angelov and X. Gu, "MICE: Multi-Layer Multi-Model Images Classifier Ensemble," in 2017 3rd IEEE International Conference on Cybernetics (CYBCONF), 2017, pp. 1-8.

L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Boca Raton: Chapman & Hall/CRC, 1984.
