Investigating the Effect of Image Data Augmentation on Deep Learning Methods

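Linearly mixing or combining multiple images is a frequently used image processing method in computer vision. Mixup, a linear operation that has recently gained prominence, improves the performance of deep-learning-based models and also makes trained models more robust against adversarial attacks; together with its ease of use, this has drawn considerable attention to the method. However, the effects of such linear processing and the underlying mechanisms behind them remain poorly understood.

In this study, we investigate the effect of linear operations on the image classification task and suggest directions for future research. We apply several self-referential linear-mixing operations to images and use the processed images to evaluate the performance of deep-learning-based image classifiers under different mixing parameters. The contribution of this study is to establish a foundation that helps researchers better understand the underlying mechanisms of linear operations in computer vision.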

Keywords

Computer vision; image classification; data augmentation

Parallel Abstract

Linearly mixing or combining multiple images is a frequently used image processing method in computer vision. Mixup, a kind of linear operation, has proven effective at improving the performance of deep-learning-based models and at increasing the robustness of trained models against adversarial attacks. However, the effect and the underlying mechanism of linear operations are poorly understood. In this study, we investigate the effect of linear operations on the task of image classification. We apply several self-referential linear-mixing operations to images and use the processed images to evaluate the performance of deep-learning-based image classifiers under different mixing parameters. The contribution of this study is to establish a foundation for better understanding the underlying mechanism of linear operations.
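For readers unfamiliar with Mixup, the following is a minimal sketch of the operation as it is commonly formulated: a convex combination of two images and of their one-hot labels, weighted by a mixing parameter λ in [0, 1] that is often drawn from a Beta(α, α) distribution. The function names `mixup` and `sample_lambda`, the toy 2×2 images, and the value of α are assumptions chosen for illustration. The self-referential mixing operations studied in this work are not specified in the abstract and are not reproduced here.

```python
import numpy as np


def mixup(x1, y1, x2, y2, lam):
    """Linearly mix two images and their one-hot labels.

    lam is the mixing parameter: lam = 1 returns the first sample,
    lam = 0 returns the second.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y


def sample_lambda(alpha, rng):
    """Draw a mixing parameter from Beta(alpha, alpha) (illustrative choice)."""
    return float(rng.beta(alpha, alpha))


# Toy example: an all-black and an all-white 2x2 RGB image, two classes.
img_a = np.zeros((2, 2, 3), dtype=np.float32)
img_b = np.ones((2, 2, 3), dtype=np.float32)
label_a = np.array([1.0, 0.0])
label_b = np.array([0.0, 1.0])

# Fixed mixing parameter, so the result is easy to check by hand.
mixed_img, mixed_label = mixup(img_a, label_a, img_b, label_b, lam=0.25)

# Randomly sampled mixing parameter (alpha = 0.4 is an assumed value).
rng = np.random.default_rng(0)
lam_random = sample_lambda(0.4, rng)
```

With `lam=0.25`, every pixel of `mixed_img` equals 0.75 and `mixed_label` is `[0.25, 0.75]`, which shows how the mixing parameter controls the contribution of each source image and label.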

Parallel Keywords

Computer vision; image classification; data augmentation

