Comparing image generation AIs


Last week I wrote about trying DALL·E and what image generation AIs will mean for our future. In that post, I mentioned that DALL·E was just one of the many AIs that specialize in generating images from natural language prompts, but I didn’t go into details.

In this post, I will show you how 4 different AIs respond to the prompt I used to generate the cover image for that post. This is in no way exhaustive research, but it will give you a hint of what’s going on.

The prompt

dark blue and soft pink abstract geometric image that represents progress, technology and humans

The more specific you get, the more exciting your results will typically be, but for the sake of this exercise, this prompt is complex enough to have fun and simple enough to make comparisons easy. Let’s break down this prompt before we start, and define our acceptance criteria.

  • dark blue and soft pink – color attributes
  • abstract, geometric image – style
  • progress, technology, and humans – subject
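
The breakdown above can also be written down as a small checklist, which makes it easier to score every tool the same way. Here is a minimal Python sketch of that idea. The names `CRITERIA` and `score` are hypothetical and only for illustration, and the pass/fail verdicts are judgments a person enters by hand after looking at the images, not something the code works out on its own.

```python
# Acceptance criteria taken from the prompt breakdown above.
# The dictionary name and its structure are illustrative assumptions.
CRITERIA = {
    "color": "dark blue and soft pink",
    "style": "abstract, geometric image",
    "subject": "progress, technology, and humans",
}


def score(verdicts):
    """Count how many criteria a tool met.

    `verdicts` maps each criterion name to True (met) or False (not met),
    as judged by a person looking at the generated images.
    """
    missing = set(CRITERIA) - set(verdicts)
    if missing:
        raise ValueError(f"missing verdicts for: {sorted(missing)}")
    return sum(bool(verdicts[name]) for name in CRITERIA)


# Made-up verdicts for an imaginary tool, just to show the call.
example = {"color": True, "style": True, "subject": False}
print(f"{score(example)}/{len(CRITERIA)} criteria met")
```

Keeping the criteria in one place means each tool gets judged against the same three questions, even though the answers themselves stay subjective.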

The results

DALL·E

DALL·E was the hardest one to access, as for now it’s in closed beta with a waitlist. I signed up on their website and a couple of weeks later I got the confirmation email. DALL·E has a limited number of credits that are recharged every month. If you run out of credits (spoiler: you will) and you want more (spoiler: you will) you can always buy more to continue creating.

✅ Color – I could go with a softer pink, but the color is good enough for what I am looking for.

❌ Style – Two of the images show humans, which goes against the abstract part of the prompt. This turned out to be a good thing in this case. More on this later.

✅ Subject –

Craiyon

Craiyon (formerly called DALL·E mini) offered the smoothest experience, as I just had to go to their website and I was able to generate my images right from there. No betas, no channels, no email verification.

Color – Only 3/9 options have what I would call a soft pink; the rest are more on a fuchsia, neon pink, or violet, but in no way soft.

Style – They are abstract and geometric images.

Subject – The images are SO abstract and SO geometric that it’s impossible to connect them to humanity or technology. Some of the patterns could be used to illustrate progress, but it’s a very long shot if you ask me.

Dreamstudio

They have a public beta and it was super fast to access their online UI. They had a lot of different parameters I could use, but I went straight to entering the prompt, changing only the number of images to 4 to get a few results to compare.

✅ Color – Pinks are soft in Dreamstudio, so these images would fit my use case better (remember that we were looking for images to illustrate a post for this blog).

✅ Style – Geometric and abstract, check.

❌ Subject – Again, it’s very hard to make a connection between these images and technology, humans or progress.

Midjourney

Midjourney is a bot! You can generate images using public Discord channels (think Slack/Teams) where you can also see other images generated by the community. I personally felt a little weird asking in a public channel, but I can see how this also helps users moderate each other. One thing I really liked about Midjourney is that I could see how the images were generated in real time, unlike the other tools that just share the final results.

✅ Color – Midjourney was the best at color, or at least it was for my personal taste and what I had in mind. I really liked how soft the pinks and how dark the blues were.

✅ Style – Midjourney gets it in 3/4 of the images, as for some reason the first image of the set is more of a blurry dream.

❌ Subject – The images are modern, but saying that they represent progress would be a stretch. Technology and humanity are also hard to see in these images.

Thoughts & conclusions

Different styles

Each AI had its own take on my prompt, just like 4 different graphic designers would probably create different results and styles with my brief. Since working with AIs is fast and cheap (actually free for now) I’d recommend you play around with the free options available: the results may surprise you.

DALL·E knew better

Better than other AIs, and even better than myself at guessing what I would like. Let me explain: DALL·E was the one AI that failed at the style of the prompt. It did not produce a 100% abstract result, as two of the images it generated for me had realistic human faces. Funnily enough, these two images are precisely the ones I liked the most.

This is part of the magic of AI: it can help us diverge from what we believe we know, and discover options we may have missed without it. This isn’t new: we all know that sometimes, when humans get together, something magical happens and the rules of math bend so that 1+1>2. Coaches and motivational speakers have called this serendipity, synergy, co-creation. In the AI world it’s called augmentation. More about it in a future post.

From this prompt I found an image I could use for my post, but also a few images I plan to use for other parts of my website. AI can’t beat the work of a real designer, but it can surely make things look a bit more interesting if you’re like me, short on time, color taste, graphic imagination and Illustrator skills.

A range of experiences

Another takeaway from this comparison is that these AIs come with a wide range of user experiences. Just looking at these 4 we have free and paid models, open betas, closed betas, open and authenticated use, a bot… We have AIs that can only respond to written prompts and AIs where we can configure the number of options, generate variations or even indicate other pictures we want considered for the result.

The topic of user experience in image generation AIs could have me writing for a while so I will save some of my comments about this for a future post.

What this tells us about the future

It’s a wonderful time to explore, learn and get inspired. I anticipate we will very soon be able to generate images with our voices (making this accessible for people who can’t write), integrate AI image generation in our #random channels at work, and even go one step further and generate dynamic images or videos. Think of what this will mean for independent creators, and how it will change the media world, the video games industry, and a lot of our jobs.

I encourage you to go ahead and try at least one of these AIs. You can reach each of them using the links in the titles. I would love to know your thoughts!

PS: The featured image was generated by Dreamstudio.
