
A Study on Using AI-Generated Images for Few-Shot Classification

Advisor: 葉梅珍

Abstract


This study investigates the use of AI-generated images for few-shot classification, where the task is to increase the diversity of samples in the dataset and thereby improve the model's classification performance. Existing data-augmentation methods, such as image rotation, scaling, and generating new samples with generative adversarial networks, synthesize images from the few samples already available, so the resulting data remain insufficiently diverse. This study instead uses a generative AI model (DALL-E) to produce varied images, effectively increasing the diversity of the dataset. However, we found that adding generated images directly to the real-image training set lowers the model's accuracy, because a gap exists between the feature spaces of generated and real images. We therefore propose a feature transformer that maps generated-image features into the real-image feature space, shortening the distance between the two. Experimental results show that mapping generated images into the real-image feature space enriches the sample distribution and in turn improves the model's classification performance.
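The conventional augmentations the abstract mentions can be sketched in a few lines. This is a toy illustration (not the thesis's pipeline): each variant merely recombines the pixels of the same source image, which is why such methods add limited diversity when only a few samples exist.

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)  # toy grayscale image

# Classic label-preserving augmentations of the kind the abstract mentions:
rotated = np.rot90(img)                                   # 90-degree rotation
flipped = np.fliplr(img)                                  # horizontal flip
scaled = np.kron(img, np.ones((2, 2), dtype=np.uint8))    # 2x nearest-neighbour upscale

# Every variant is a deterministic rearrangement of the SAME pixels,
# so the augmented set stays close to the original few samples.
print(rotated.shape, flipped.shape, scaled.shape)
```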

Parallel Abstract


The goal of this study is to improve a model's classification performance by increasing the diversity of samples in the dataset through AI-generated images for few-shot classification. Existing data-augmentation methods, such as rotating and resizing images or using generative adversarial networks (GANs) to synthesize new samples, generate images from only the few known samples, so the resulting data may remain insufficiently varied. This work instead generates a variety of images with a generative AI model (DALL-E), effectively increasing the diversity of the dataset. However, because a gap exists between the feature spaces of generated and real images, we observed that adding generated images directly to the real-image training set reduces the model's accuracy. To minimize the distance between the two feature spaces, we propose a feature transformer that maps the features of generated images into the feature space of real images. Experiments show that the model's classification performance improves when the sample distribution is enriched by mapping generated images into the real-image feature space.
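The feature-alignment idea can be illustrated with a minimal sketch. The abstract does not specify the transformer's architecture, so everything below is an assumption: synthetic "real" and "generated" features stand in for extracted image features, and a simple affine map fitted by least squares plays the role of the learned feature transformer, shrinking the gap between the two feature spaces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all data synthetic): "generated" features are a shifted,
# rescaled version of paired "real" features, mimicking the domain gap
# the abstract describes between DALL-E images and real images.
d = 16
real = rng.normal(0.0, 1.0, size=(200, d))
shift = rng.normal(2.0, 0.1, size=d)
gen = real * 0.5 + shift + rng.normal(0.0, 0.05, size=(200, d))

# A stand-in for the feature transformer: an affine map fitted by least
# squares so transformed generated features land near their real pairs.
G = np.hstack([gen, np.ones((len(gen), 1))])   # append bias column
W, *_ = np.linalg.lstsq(G, real, rcond=None)
mapped = G @ W

def mean_gap(a, b):
    """Average Euclidean distance between paired feature vectors."""
    return float(np.linalg.norm(a - b, axis=1).mean())

print(mean_gap(gen, real))     # large gap before mapping
print(mean_gap(mapped, real))  # much smaller gap after mapping
```

In the thesis setting the map would be trained on feature pairs from a backbone network; the point of the sketch is only that an explicit mapping into the real-image feature space removes most of the systematic offset that hurts accuracy.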

