MythraGen: Two-Stage Retrieval Augmented Art Generation Framework
2026-06-22 • Computer Vision and Pattern Recognition
AI summary
The authors developed a new method called MythraGen to create artistic images from text descriptions. They combine searching for similar art in a big database with a way to fine-tune an image generator so it learns specific artistic styles. By using this approach, their system produces artworks that better match what the user describes. Tests with real art data show their method works better than other current tools.
text-to-image generation, artistic image generation, retrieval augmented generation, LoRA, Stable Diffusion, fine-tuning, feature extraction, WikiArt dataset, generative models
Authors
Quang-Khai Le, Cong-Long Nguyen, Minh-Triet Tran, Trung-Nghia Le
Abstract
Text-to-image generation has advanced rapidly with the development of generative models. However, producing high-quality, contextually accurate images that faithfully match the given textual description remains challenging, especially in artistic generation. In this paper, we present MythraGen, a simple yet efficient retrieval-augmented generation framework for text-to-artistic-image generation that integrates an art retrieval mechanism with LoRA-based model fine-tuning. Our method extracts features from a large-scale art dataset and optimizes the generation process by combining artist-specific styles and content. Specifically, the images in an external art database that are most similar to the query prompt are retrieved and used to fine-tune Stable Diffusion with LoRA for the desired art generation. Experimental results and user studies on the WikiArt dataset show that our proposed method generates artworks that closely match the user's input, significantly outperforming existing solutions.
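The retrieval stage described in the abstract can be illustrated with a minimal sketch. The abstract does not state which feature extractor or similarity measure is used, so the following assumes that the prompt and every artwork have already been embedded into a shared vector space and that similarity is cosine similarity. The function names, file names, and toy vectors are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query_vec: Sequence[float],
    art_db: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k database entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in art_db.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy database: file name -> precomputed feature vector.
# Real embeddings would come from a feature extractor (an assumption here).
art_db = {
    "starry_night.jpg": [0.9, 0.1, 0.0],
    "water_lilies.jpg": [0.1, 0.9, 0.1],
    "wheatfield.jpg": [0.8, 0.2, 0.1],
}

# Hypothetical embedding of the user's text prompt.
query_vec = [1.0, 0.0, 0.0]

# The retrieved images would form the small set used for LoRA fine-tuning.
finetune_set = [name for name, _ in retrieve_top_k(query_vec, art_db, k=2)]
print(finetune_set)
```

In this sketch, the retrieved file names stand in for the images that would then be passed to a LoRA fine-tuning run of Stable Diffusion; that second stage is not shown here.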