

Generating personalized fonts with few-shot learning based on GAN model

Advisor: 洪文斌

Abstract


Glyph generation has long been a challenging task, especially for languages with very large character sets such as Chinese and Japanese. Creating a new font involves not only a complex design process but also maintaining a consistent overall style, which makes glyph design a craft that few can master.

Traditional font style transfer methods rely mainly on supervised learning, which requires large amounts of paired data that are both difficult to obtain and time-consuming to prepare. Moreover, earlier methods typically extract only a generic style from the whole image, which yields unsatisfactory results for Chinese characters, since a character's composition and stroke style vary with position.

To overcome these problems, this study proposes a new strategy that combines a multi-head cross-attention mechanism with a predefined character decomposition scheme, allowing the model to learn during training how glyphs vary across different styles. Notably, the approach generates in a few-shot manner: given only a small number of reference samples, it can effectively produce fonts that match a target style.

This strategy not only sidesteps the difficulty of collecting paired datasets but also ensures high-quality font generation. Experiments show that it successfully achieves large-scale font style transfer, broadening its range of application and offering a breakthrough solution for the field of font style transfer.

Abstract (English)


Glyph generation has always been a challenging task, especially when dealing with languages with a large number of characters, such as Chinese and Japanese. Creating new fonts involves not only a complex design process but also ensuring a consistent overall style, which makes glyph design a skill mastered by only a few. Traditional methods for font style transfer rely mainly on supervised learning, requiring a vast amount of paired data that is both difficult to obtain and very time-consuming to prepare. Moreover, earlier approaches often extract only the general style of the entire image, leading to suboptimal results for Chinese characters, as a character's composition and stroke style can change depending on position. To overcome these issues, this study proposes a new strategy that combines a multi-head cross-attention mechanism with a predetermined character decomposition strategy. This allows the model to learn how glyphs vary under different styles during training. Notably, this approach requires only a small number of reference styles to effectively generate fonts that match a particular style. With this strategy, we not only avoid the problem of collecting paired datasets but also ensure high-quality font generation. Experiments show that the strategy can successfully achieve large-scale font style transfer, broadening its application scope and providing a breakthrough solution in the field of font style transfer.
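The core idea above, component-level content features querying the features of a handful of style-reference glyphs, can be sketched as a single multi-head cross-attention step. The following NumPy snippet is a minimal illustration rather than the thesis's actual architecture: the token counts, feature size, and the random projection matrices (stand-ins for learned weights) are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(content, style, n_heads=4):
    """Cross-attention: content tokens (queries) attend to style tokens (keys/values).

    content: (Lq, d) features of the decomposed components of the glyph to render.
    style:   (Lk, d) features pooled from the few style-reference glyphs.
    Returns stylized content features (Lq, d) and attention weights (n_heads, Lq, Lk).
    """
    Lq, d = content.shape
    Lk, _ = style.shape
    assert d % n_heads == 0
    dh = d // n_heads
    rng = np.random.default_rng(0)
    # Fixed random projections standing in for learned Q/K/V weight matrices.
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    # Project, then split the feature dimension across heads: (n_heads, L, dh).
    Q = (content @ Wq).reshape(Lq, n_heads, dh).transpose(1, 0, 2)
    K = (style @ Wk).reshape(Lk, n_heads, dh).transpose(1, 0, 2)
    V = (style @ Wv).reshape(Lk, n_heads, dh).transpose(1, 0, 2)
    # Scaled dot-product attention per head: each content component picks out
    # the style tokens most relevant to its position and shape.
    attn = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(dh), axis=-1)
    out = (attn @ V).transpose(1, 0, 2).reshape(Lq, d)
    return out, attn

# Example: 6 component tokens of a decomposed character attend to
# 8 feature tokens drawn from 3 style-reference glyphs.
content = np.random.default_rng(1).standard_normal((6, 32))
style = np.random.default_rng(2).standard_normal((8, 32))
out, attn = multi_head_cross_attention(content, style)
print(out.shape, attn.shape)  # (6, 32) (4, 6, 8)
```

Because attention is computed per component rather than once per image, each part of a character can pick up position-dependent stroke styling from the references, which is the property the abstract argues whole-image style extraction lacks.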

