Unsupervised Feature Learning and Multi-Resolution Histogram Pooling

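Convolutional neural networks (CNNs) are widely applied to a broad range of problems because of their excellent image classification performance. Training a competitive CNN, however, requires a large amount of labeled data and careful selection of training parameters, which makes CNNs hard to get started with. To retain the discriminative power of CNNs while avoiding these two problems, we propose an alternative architecture. It uses our clustering-based kernel generation mechanism: unlabeled data are grouped with k-means clustering, and the centroid, principal components, and independent component analysis (ICA) bases of each cluster are extracted as the convolution kernels of the architecture, combining global and local information. In addition, the architecture applies multi-resolution histogram pooling: each feature map is partitioned into blocks of different sizes, and the distribution of values within each block is represented by a histogram, which serves as a more effective classification feature and improves classification. Compared with previously reported classification results on the CIFAR-10 dataset, our architecture produces competitive results in a very short time. Being fast and easy to train, it can be readily applied to a wide variety of problems.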
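The clustering-based kernel generation described above can be illustrated with a minimal NumPy sketch. It covers only the centroid and principal-component parts; the ICA bases are omitted for brevity. The function names (`extract_patches`, `kmeans`, `cluster_kernels`), the patch size, the number of clusters, and the number of components per cluster are all assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def extract_patches(images, patch_size, n_patches, rng):
    """Sample random square patches from grayscale images of shape (N, H, W)."""
    n, h, w = images.shape
    idx = rng.integers(0, n, size=n_patches)
    ys = rng.integers(0, h - patch_size + 1, size=n_patches)
    xs = rng.integers(0, w - patch_size + 1, size=n_patches)
    patches = np.stack([
        images[i, y:y + patch_size, x:x + patch_size].ravel()
        for i, y, x in zip(idx, ys, xs)
    ])
    # Remove each patch's mean so clusters reflect structure, not brightness.
    return patches - patches.mean(axis=1, keepdims=True)


def kmeans(data, k, n_iter, rng):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every patch to every centroid.
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def cluster_kernels(data, k, n_components, n_iter=10, seed=0):
    """Stack cluster centroids and per-cluster principal components as kernels.

    Hypothetical helper: the thesis also uses ICA bases, which are not
    computed here.
    """
    rng = np.random.default_rng(seed)
    centroids, labels = kmeans(data, k, n_iter, rng)
    kernels = [centroids]
    for j in range(k):
        members = data[labels == j]
        if len(members) <= n_components:
            continue  # too few patches to estimate principal components
        centered = members - members.mean(axis=0)
        # Rows of vt are unit-norm principal directions, largest variance first.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        kernels.append(vt[:n_components])
    return np.vstack(kernels)
```

Each row of the returned matrix is a flattened kernel; reshaping a row to `(patch_size, patch_size)` gives a filter that can be convolved with an image. The first `k` rows are the centroids, and the remaining rows are the principal components of the clusters that had enough members.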

Keywords

Unsupervised Feature Learning; Pooling; Convolutional Neural Network; k-means Clustering

Parallel Abstract

Convolutional neural networks have been successfully applied to various tasks owing to their strong performance on many image classification benchmarks. However, it is hard to train an effective convolutional neural network because of the tremendous effort required to label the training data and select appropriate training parameters. To keep the discriminative power of convolutional neural networks while avoiding these two problems, we propose an alternative architecture that combines clustering-based kernels for unsupervised feature learning with multi-resolution histogram pooling. The proposed clustering-based kernel generation approach produces more robust kernels from unlabeled training images by exploiting the centroids, principal components, and independent component analysis bases of the underlying cluster structure. Multi-resolution histogram pooling, in turn, represents each feature map by histograms computed over regions of different sizes, which makes more efficient use of the feature maps. Compared with previous classification results on the CIFAR-10 dataset, our work produces a competitive result in a more efficient way. The scalability and rapidity of our architecture make it easy to apply to different image recognition problems with competitive results.
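As a rough illustration of multi-resolution histogram pooling, the sketch below splits a single 2-D feature map into grids of several resolutions and concatenates a normalised histogram for every block. The function name `histogram_pool`, the grid sizes, the number of bins, and the value range are assumptions chosen for the example; the abstract specifies only that regions of different sizes are summarised by histograms.

```python
import numpy as np


def histogram_pool(feature_map, grid_sizes=(1, 2, 4), n_bins=8,
                   value_range=(0.0, 1.0)):
    """Concatenate per-block histograms of a 2-D feature map over several grids.

    grid_sizes, n_bins and value_range are illustrative defaults, not values
    from the thesis. Values outside value_range are ignored by np.histogram.
    """
    h, w = feature_map.shape
    features = []
    for g in grid_sizes:
        # Integer boundaries that split the map into a g x g grid of blocks.
        row_edges = np.linspace(0, h, g + 1).astype(int)
        col_edges = np.linspace(0, w, g + 1).astype(int)
        for r in range(g):
            for c in range(g):
                block = feature_map[row_edges[r]:row_edges[r + 1],
                                    col_edges[c]:col_edges[c + 1]]
                counts, _ = np.histogram(block, bins=n_bins, range=value_range)
                # Divide by block size so blocks of different sizes are comparable.
                features.append(counts / max(block.size, 1))
    return np.concatenate(features)
```

With the default settings, an 8 x 8 feature map yields (1 + 4 + 16) blocks of 8 bins each, so the pooled vector has 168 entries. In a full pipeline, one such vector would be computed per feature map and the results concatenated before being passed to a classifier.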

Parallel Keywords

Unsupervised Feature Learning; Pooling; Convolutional Neural Network; k-means Clustering
