Midjourney 5.1 and 5.0 compared: is it a big leap forward?
Midjourney have just released version 5.1. Why not 6.0? My guess is that these technologies are moving ahead so fast that they don’t want to find themselves on version 93 in a few years’ time. Apparently it’s more ‘opinionated’ than 5.0 - I’m not sure what that means - but the proof will be in the images themselves.
Let’s try it out.
First prompt: “Photograph of a child wearing a red hat and sunglasses, holding an ice cream, kodak portra, soft outdoor light, --ar 3:4 --v 5”
This is 5.0:
Pretty good as we’ve come to expect with version 5.0: it’s hard to tell them apart from actual photos of actual children (I still can’t wrap my head around the fact that these children DO NOT EXIST. It’s a mad world we live in.) Ok, the first one has six fingers but let’s not be too picky.
Now, the same prompt with version 5.1:
Can I tell the difference? Well, in the quality of the images, not so much. But in terms of their ‘personality’, definitely. These are just more interesting. The kids aren’t looking so dumb for a start: all the 5.0 images have that blank look we’ve come to expect from Midjourney portraits. Here we have a smile, we have a girl about to drink from a straw (not sure where that one came from but ok), one kid’s ice cream is melting, and their clothes are more funky. The lighting is rendered better too. And look at the hands. LOOK AT THE HANDS. The correct number of fingers, finally. I like it!
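If you want to repeat this side-by-side test yourself, the only thing that changes between the two runs is the version flag at the end of the prompt. Here’s a minimal sketch in Python that builds both prompt strings. The helper name `build_prompt` is my own invention for illustration, and it only produces text to paste into the /imagine command: it doesn’t talk to Midjourney in any way.

```python
def build_prompt(description: str, aspect_ratio: str, version: str) -> str:
    """Append the aspect-ratio and version parameters to a text description.

    Hypothetical helper: it just formats a string in the same
    '--ar W:H --v N' shape used in the prompts in this post.
    """
    return f"{description} --ar {aspect_ratio} --v {version}"


description = (
    "Photograph of a child wearing a red hat and sunglasses, "
    "holding an ice cream, kodak portra, soft outdoor light"
)

# Same description and aspect ratio, two model versions to compare.
for version in ("5", "5.1"):
    print(build_prompt(description, "3:4", version))
```

Keeping everything but the version identical is what makes the comparison fair, although the results will still vary from run to run.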
Let’s try something a bit different.
“A photograph of a landscape with a church in the background, early morning light, large format camera, f32, bracken in the foreground, still water --ar 16:9”
Version 5 first:
Again not bad as we’d expect. Composition is a bit off with the church pushed to the top of the frame, and there’s this random camera in numbers 1 and 4 (very meta, Midjourney), but lighting has been handled pretty well.
Now here’s 5.1:
Again I’m not sure of the reason for sticking the church so high up in the frame, but I would say the colours here are deeper, and it has realised I didn’t want an actual camera in the shot (I mean, why would I?). Generally the image looks more professional in my opinion.
Ok, next one. Something a bit weird to test out its capacity for originality.
“An editorial style image of a steampunk style warehouse filled with people working, lots of energy, light streaming through high windows catching dust in the air, M.C. Escher --ar 4:3”
Version 5 first:
The light is rendered quite nicely and the brief has been handled with some originality. I would say the aesthetic is workhouse circa 1800 rather than steampunk, though. Number 2 has some steampunk elements but there isn’t much stylisation. Now let’s look at 5.1:
More style, more colour, more people, more steampunk. What can I say?
Let’s now look at how it handles styles. MJ 5 did this pretty well, able to reflect the styles of various artists and photographers in its output. Let’s ask it to mimic a well known street photographer from the 30s and 40s:
“A monochrome photograph of two street urchins in the style of Henri Cartier Bresson, 1940s --ar 4:3”
First, 5.0:
Now, 5.1:
5.0 does a great job but as always falls into the trap of cropping too close, which HCB rarely did: he took most of his images using a 50mm lens on his Leica, which he said was the closest to how the human eye saw. Most of his images were medium shots showing contextual background, which 5.1 does brilliantly. I mean, I am blown away by how much these look like HCB’s style. Look at the backgrounds. So realistic. And the boys themselves look more ‘street urchin-y’, moving away from MJ 5’s tendency to make everyone look beautiful. Bravo Midjourney!
And finally, another setting in Midjourney that you can now alter through the /settings menu is style. This allows the AI to exercise its creativity in varying degrees and can produce some random (if fun) results. Here’s the first prompt with the Style setting dialled all the way up to Very High (super creative):
Plastic ice creams and ice creams in glasses. Why not? Love the shades too.
That’s just a quick rundown of the new MJ. Is it noticeably better than 5.0? In terms of image quality I don’t think it’s a huge leap, as 5.0 was already pretty amazing. But in terms of moving more into the realm of ‘this is how a random human would take the photo’ then yes, we are getting closer, step by step, to a world where AI does most things as well as the best of us, if not better.
But I still can’t get over the fact that these kids aren’t real.