An image model at Midjourney’s level?



A new beta version of Stable Diffusion delivers much more aesthetic and photorealistic results than the previous version. Will this make commercial offerings obsolete?

While Stable Diffusion is the most developed open-source image model, it can’t always match the quality and especially the accessibility of commercial competitors like Midjourney.

Its strength so far is not so much in generating aesthetic images after entering a few commands, but in its openness and the possibility of further development by a constantly growing community.

Stable Diffusion XL: Beta available via DreamStudio and API

While Stable Diffusion v2.1 was already a visible leap over v1.5, at least in some scenarios, the latest version, Stable Diffusion XL (v2.2.2), marks a significant improvement. It is still under development, but a beta version is already available via the paid DreamStudio web interface and API. The code will be released on GitHub as usual once it is finished.


We are pleased to announce the latest release in our Stable Diffusion series of imaging solutions. SDXL offers a variety of image generation capabilities that are transformative across multiple industries, including graphic design and architecture, with results happening right before our eyes.

Tom Mason, CTO of Stability AI

Stable Diffusion XL comes with a number of enhancements that should pave the way for version 3. Exactly how the training material differs from previous versions is unknown. However, 80 million images are said to have been removed for v3 at the request of artists.

“Minimalistic home gym with rubber flooring, wall-mounted TV, weight bench, medicine ball, dumbbells, yoga mats, high-tech equipment, high detail, organized and efficient.”

At 2.3 billion parameters, SDXL is also significantly larger than v2.1, which has 900 million. According to Stability AI CEO Emad Mostaque, the plan is to have a distilled version ready by the time of release and to offer it as an alternative.
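To put the size difference in perspective, here is a rough back-of-the-envelope sketch in Python. The parameter counts are the ones quoted in this article; the bytes-per-parameter values (4 for fp32, 2 for fp16) are assumptions about how the weights might be stored, and the helper names are hypothetical. It estimates only the size of the weights, not the memory actually needed to generate an image.

```python
# Parameter counts as reported in the article.
PARAMS = {"v2.1": 900_000_000, "SDXL": 2_300_000_000}

# Assumed storage precisions (not stated in the article).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_gigabytes(n_params: int, precision: str) -> float:
    """Size of the weights alone, in GB (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9


def size_ratio(larger: str, smaller: str) -> float:
    """How many times more parameters one model has than another."""
    return PARAMS[larger] / PARAMS[smaller]


if __name__ == "__main__":
    for name, n_params in PARAMS.items():
        for precision in BYTES_PER_PARAM:
            gb = weight_gigabytes(n_params, precision)
            print(f"{name} at {precision}: about {gb:.1f} GB of weights")
    print(f"SDXL / v2.1 parameter ratio: {size_ratio('SDXL', 'v2.1'):.2f}")
```

Under these assumptions, SDXL has roughly two and a half times as many parameters as v2.1, which may help explain why a smaller distilled variant is planned as an alternative.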

Stable Diffusion XL delivers more photorealistic results and a bit of text

In general, SDXL seems to deliver more accurate and higher-quality results, especially when it comes to photorealism. SDXL also handles human anatomy, which even Midjourney struggled with for a long time, much better, although the finger problem does not seem to have been solved yet.

“Skilled archer, bow and quiver of arrows, standing in forest clearing, intense, detailed, high detail, portrait.”

