
Classifying the Artistic Styles of Paintings Using Transfer Learning and Fine-Tuned Models

Classify Painting Styles Using Transfer Learning and a Fine-Tuned Model

Advisor: 余執彰

Abstract


Most classification problems today are built on object recognition in natural images and are rarely applied to artworks. Art has always been something the general public struggles to make sense of: many famous painters no longer pursue the depiction of physical objects, and what they paint is often not something that can be seen in the real world, sometimes going beyond human cognition altogether. For people with little exposure to art, it is therefore hard to appreciate the value of a painting; without understanding what a painting is trying to express, its style becomes even harder to pin down, and telling two painting styles apart is no easy task. For the problem of classifying painting styles, raising the classification accuracy is thus full of obstacles.

This study proposes deep-learning-based methods that try to let a computer appreciate world-famous paintings and classify their styles. The experiments use the open dataset WikiArt; from artworks spanning hundreds of painting categories, this study selects the more distinctive artistic styles and trains classifiers on them. Three approaches are adopted. The first uses transfer learning: the ResNet50 CNN (Convolutional Neural Network) model is coupled with a traditional machine-learning SVM, with the deep model acting as a feature extractor. The second fine-tunes the ResNet50 model to improve its classification ability. The third combines an FCN (Fully Convolutional Network) with a patch-voting mechanism to obtain more objective predictions. Together, these approaches examine whether they help the computer find rules that distinguish paintings of different styles. The combined experimental results show a training accuracy of 99.92%, a validation accuracy of 70.22%, and a test accuracy of 72.84%.

Parallel Abstract


Most recent classification work is based on object recognition in natural images; few methods are applied to artwork classification. Art has always been something that leaves the general public confused. In particular, many famous painters no longer pursue the depiction of physical objects, and what they paint is often not something seen in real life; some works even go beyond human cognition. It is not easy for people unfamiliar with art to appreciate the value of a painting, and without understanding what a painting is trying to express, its style is even harder to grasp. In short, identifying the difference between the styles of two paintings is no simple task, and improving the accuracy of painting-style classification faces several challenges. This research proposes three deep-learning-based methods that try to make computers appreciate world-famous paintings and classify their styles. The experimental data come from WikiArt, an open dataset; from its hundreds of painting categories, this research selects the more distinctive artistic styles for classification. The first approach adopts transfer learning: the CNN model ResNet50 is used as a feature extractor followed by a traditional SVM classifier. The second approach fine-tunes the ResNet50 model, followed by another multi-layer perceptron. The third approach adopts an FCN (Fully Convolutional Network) with a patch-voting mechanism to obtain more robust predictions. We discuss how effectively these three methods help the computer understand painting styles. The experimental results of the third approach show a training accuracy of 99.92%, a validation accuracy of 70.22%, and a test accuracy of 72.84%.
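
The sketches below are illustrations only, not the thesis code. As a rough example of the first approach, the following Python sketch uses a frozen ResNet50 (ImageNet weights) as a feature extractor and trains an SVM on the pooled features; the dataset path, image size, and SVM settings are assumptions made for the example.

import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

IMG_SIZE = (224, 224)                      # assumed input resolution
train_ds = tf.keras.utils.image_dataset_from_directory(
    "wikiart/train", image_size=IMG_SIZE, batch_size=32)   # hypothetical path

# Frozen ResNet50 backbone; global average pooling gives one 2048-d vector per image.
backbone = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg")
backbone.trainable = False

def extract_features(dataset):
    feats, labels = [], []
    for images, y in dataset:
        x = tf.keras.applications.resnet50.preprocess_input(images)
        feats.append(backbone(x, training=False).numpy())
        labels.append(y.numpy())
    return np.concatenate(feats), np.concatenate(labels)

X_train, y_train = extract_features(train_ds)

# Linear SVM on the extracted features (the kernel choice is an assumption).
svm = SVC(kernel="linear", C=1.0)
svm.fit(X_train, y_train)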
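
For the second approach, a minimal fine-tuning sketch follows: the early ResNet50 layers stay frozen while the last residual stage and a small classification head are trained with a low learning rate. The number of styles, the number of unfrozen layers, and the head layout are assumptions; the abstract only states that ResNet50 is fine-tuned and followed by another multi-layer perceptron.

import tensorflow as tf

NUM_STYLES = 10                            # assumed number of selected styles
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(224, 224, 3))

# Freeze all but (roughly) the last residual stage.
for layer in base.layers[:-30]:
    layer.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),           # small MLP head
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_STYLES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=20)     # datasets as in the first sketch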
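
The third approach pairs an FCN with patch voting. The abstract does not give enough detail to reproduce the FCN itself, so the sketch below only illustrates the voting step: fixed-size patches are cropped from one painting, each patch is classified by some per-patch style model (the patch_model argument is a stand-in), and the most common prediction becomes the painting's style.

import numpy as np

PATCH, STRIDE = 224, 112                   # assumed patch size and stride

def predict_style_by_voting(patch_model, image):
    """Majority vote over PATCH x PATCH crops of image (H, W, 3 array, H and W >= PATCH)."""
    h, w, _ = image.shape
    votes = []
    for top in range(0, h - PATCH + 1, STRIDE):
        for left in range(0, w - PATCH + 1, STRIDE):
            patch = image[top:top + PATCH, left:left + PATCH].astype("float32")
            x = patch[np.newaxis] / 255.0  # preprocessing assumed to match patch_model's training
            votes.append(int(np.argmax(patch_model.predict(x, verbose=0))))
    # The style predicted for the most patches wins.
    return int(np.bincount(votes).argmax())

Voting over many regions keeps any single part of the canvas from dominating the decision, which is one way to read the "more robust prediction results" the abstract mentions.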

