HMN 2025: How AI image models gain a creative edge by amplifying low-frequency features

Original vs. C3 (Ours): compared to the original diffusion models, C3 consistently generates more creative images with no added computational cost. Credit: *arXiv* (2025). DOI: 10.48550/arxiv.2503.23538

Text-based image generation models can now automatically create high-resolution, high-quality images solely from natural language descriptions. However, when a typical example such as the Stable Diffusion model is given the text “creative,” its ability to generate truly creative images remains limited.

KAIST researchers have developed a technology that can enhance the creativity of text-based image generation models such as Stable Diffusion without additional training, allowing AI to draw creative chair designs that are far from ordinary.

Professor Jaesik Choi’s research team at KAIST Kim Jaechul Graduate School of AI, in collaboration with NAVER AI Lab, developed this technology to enhance the creative generation of AI generative models without the need for additional training. The work is published on the arXiv preprint server, and the code is available on GitHub.

Professor Choi’s research team developed a technology to enhance creative generation by amplifying the internal feature maps of text-based image generation models. They also discovered that shallow blocks within the model play a crucial role in creative generation. They confirmed that amplifying values in the high-frequency region, after converting feature maps to the frequency domain, can lead to noise or fragmented color patterns.

Accordingly, the research team demonstrated that amplifying the low-frequency region of shallow blocks can effectively enhance creative generation.

Overview of the methodology studied by the research team. After converting the internal feature map of a pre-trained generative model into the frequency domain through the Fast Fourier Transform, the low-frequency region of the feature map is amplified, then transformed back into the feature space through the Inverse Fast Fourier Transform to generate an image. Credit: The Korea Advanced Institute of Science and Technology (KAIST)
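The FFT, amplify, inverse-FFT sequence described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the team's released code: the function name `amplify_low_frequencies`, the `scale` and `radius` parameters, their default values, and the circular low-frequency mask are all assumptions made for the example.

```python
import numpy as np

def amplify_low_frequencies(feature_map, scale=1.5, radius=0.25):
    """Boost the low-frequency band of a (channels, height, width) array.

    `scale` (gain applied to low frequencies) and `radius` (cutoff as a
    fraction of the Nyquist frequency) are illustrative, hypothetical
    parameters, not values taken from the paper.
    """
    _, height, width = feature_map.shape

    # Forward 2D FFT over the two spatial axes of every channel.
    spectrum = np.fft.fft2(feature_map, axes=(-2, -1))

    # Frequency of each FFT bin in cycles per sample, in [-0.5, 0.5).
    freq_y = np.fft.fftfreq(height)[:, None]
    freq_x = np.fft.fftfreq(width)[None, :]

    # Circular mask around the zero-frequency bin (Nyquist is 0.5).
    low_band = np.sqrt(freq_y ** 2 + freq_x ** 2) <= radius * 0.5

    # Scale low-frequency bins, leave all other bins unchanged.
    gain = np.where(low_band, scale, 1.0)

    # Inverse FFT back to the feature space. The mask is symmetric, so a
    # real input yields a real output up to rounding error.
    return np.fft.ifft2(spectrum * gain, axes=(-2, -1)).real
```

In this sketch, a constant feature map (pure zero frequency) is simply multiplied by `scale`, while a pixel-level checkerboard (pure high frequency) passes through unchanged. In an actual diffusion model, such a function would be applied to the intermediate activations of selected shallow blocks during sampling; how those activations are accessed depends on the model implementation and is not shown here.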

Considering originality and usefulness as the two key elements defining creativity, the research team proposed an algorithm that automatically selects the optimal amplification value for each block within the generative model.
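The article does not describe how the selection algorithm works, so the following is only a generic sketch of the trade-off it names: pick the amplification value that maximizes a novelty score while keeping a usefulness score above a floor. The function `select_amplification`, the two scoring callables, and the toy scores in the usage lines are hypothetical placeholders, not the paper's method or metrics.

```python
def select_amplification(candidates, novelty_fn, usefulness_fn, min_usefulness):
    """Return the most novel candidate whose usefulness meets the floor.

    `novelty_fn` and `usefulness_fn` are hypothetical callables that map an
    amplification value to a score. Returns None if no candidate qualifies.
    """
    best_value = None
    best_novelty = float("-inf")
    for value in candidates:
        # Discard values that degrade usefulness too much.
        if usefulness_fn(value) < min_usefulness:
            continue
        novelty = novelty_fn(value)
        if novelty > best_novelty:
            best_value, best_novelty = value, novelty
    return best_value


# Toy scores for illustration only: novelty grows with amplification,
# usefulness shrinks with it.
toy_novelty = lambda value: value
toy_usefulness = lambda value: 1.0 - 0.2 * (value - 1.0)

chosen = select_amplification(
    [1.0, 1.5, 2.0, 2.5, 3.0], toy_novelty, toy_usefulness, min_usefulness=0.75
)
print(chosen)
```

Running the same search once per block, with scores measured on images generated at each candidate value, would give one amplification value per block, which is the outcome the article describes.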

With this algorithm, appropriately amplifying the internal feature maps of a pre-trained Stable Diffusion model enhanced creative generation without additional classification data or training.

The research team demonstrated quantitatively, using various metrics, that their algorithm can generate images that are more novel than those from existing models without significantly compromising utility.

In particular, they confirmed an increase in image diversity by mitigating the mode collapse problem that occurs in the SDXL-Turbo model, which was developed to significantly improve the image generation speed of the Stable Diffusion XL (SDXL) model. Furthermore, user studies confirmed a significant improvement in novelty relative to utility compared to existing methods.

Jiyeon Han and Dahee Kwon, Ph.D. candidates at KAIST and co-first authors of the paper, stated, “This is the first methodology to enhance the creative generation of generative models without new training or fine-tuning. We have shown that the latent creativity within trained AI generative models can be enhanced through feature map manipulation.”

They added, “This research makes it easy to generate creative images using only text from existing trained models. It is expected to provide new inspiration in various fields, such as creative product design, and contribute to the practical and useful application of AI models in the creative ecosystem.”

This research, co-authored by Jiyeon Han and Dahee Kwon, Ph.D. candidates at KAIST Kim Jaechul Graduate School of AI, was presented on June 16 at the International Conference on Computer Vision and Pattern Recognition (CVPR), an international academic conference.

More information:
Jiyeon Han et al, Enhancing Creative Generation on Stable Diffusion-based Models, arXiv (2025). DOI: 10.48550/arxiv.2503.23538

Journal information:
arXiv

Provided by
The Korea Advanced Institute of Science and Technology (KAIST)
