DALL E sucks

Angelos Alexopoulos
13 min read · Aug 10, 2022

A real comparison between DALL·E 2 and other image generation algorithms such as DALL·E mini, VQGAN, Latent Diffusion, and Midjourney

DALL·E 2 is a new AI system that can create realistic images and art from natural language descriptions. It can combine concepts, attributes, and styles. In January 2021, OpenAI introduced DALL·E. One year later, their newest system, DALL·E 2, generates more realistic and accurate images with 4x greater resolution. We have already seen some really impressive images generated by DALL·E 2.

When OpenAI accepted my request for access to DALL·E 2, I was super excited and eager to try its image generation capabilities. I have a long history of using image generation technologies such as GANs (DCGANs, SRGANs) and DALL·E alternatives (Craiyon, formerly DALL·E mini; DALL·E Flow; VQGAN; NeuroGen). But when I saw the results from DALL·E 2, I was really amazed!

There is a lot of discussion about OpenAI's practices. Some people ask how this company can be named “Open” AI when all its systems and algorithms are closed. I largely agree with those voices that expect more transparency in the field. However, I have a feeling that it's harder than we think. Image generation algorithms typically need many training images and huge processing power to train. Those requirements make it hard for an individual researcher to recreate or adjust existing algorithms. Furthermore, we have to remember that we have imposed a set of limitations on the generative power of those algorithms, e.g. against generating nudes or perverted or distorted images. Lastly, a lot of people think that image generation algorithms should also address the stereotypes we hold about sex and race. The topic is more complex than most people think, but it's a good discussion that we should keep having without restricting further research.

To make a good comparison between DALL·E 2 and its alternatives, I will be using a nice YouTube video created for the Tomorrowland festival’s Love Tomorrow Conference in July 2022. What a wonderful video!

I have broken the video down image by image and run DALL·E 2 with the same text prompt for each one, so that we can compare outputs from exactly the same inputs.

1. painting, bowl of soup floating in space surrounded by galaxies and stars, John Atkinson Grimshaw

Here we ask our AI to generate a painting with a style similar to John Atkinson Grimshaw. John Atkinson Grimshaw (6 September 1836–13 October 1893) was an English Victorian-era artist best known for his nocturnal scenes of urban landscapes. He was called a “remarkable and imaginative painter” by the critic and historian Christopher Wood in Victorian Painting (1999). Some examples of his paintings can be found below:

Let’s see what DALL·E mini and DALL·E 2 generate:

DALL E mini
DALL E2

Comparison Result: Equal

2. galaxies and stars in space, John Atkinson Grimshaw

Again we ask for an image inspired by John Atkinson Grimshaw’s work.

DALL E mini
DALL E2

Comparison Result: DALL E2

3. oil painting, atoms dancing in a bowl of soup in space

This is an interesting experiment since we do not specify a particular style. Again we compare DALL·E mini with DALL·E 2.

DALL E mini
DALL E2

Comparison Result: DALL·E 2 (I was inclined to call it a tie here, but I think DALL·E 2 shows the dancing atoms more clearly)

4. painting, then came the trees, John Atkinson Grimshaw

This input again uses Grimshaw’s style; however, the prompt is really abstract.

DALL E mini
DALL E2

Comparison Result: DALL·E mini. Another hard decision here, since I think the DALL·E 2 results have more detail and better resolution, but the style of the mini results matches the whole mood of the text and video really well.

5. picture, prehistoric trees on fire, Henri Rousseau

Let’s change painter and ask for a Henri Rousseau style.

Henri Julien Félix Rousseau (21 May 1844–2 September 1910) was a French post-impressionist painter in the Naïve or Primitive manner. He was also known as Le Douanier (the customs officer), a humorous description of his occupation as a toll and tax collector. He started painting seriously in his early forties; by age 49, he retired from his job to work on his art full-time.

Ridiculed during his lifetime by critics, he came to be recognized as a self-taught genius whose works are of high artistic quality. Rousseau’s work exerted an extensive influence on several generations of avant-garde artists.

Henri Rousseau paintings

In this case, we will compare VQGAN with DALL·E 2. VQGAN is a generative adversarial network that is good at generating images that look similar to others.

VQGAN
DALL E 2

Comparison Result: Equal. If I had to choose, though, I would probably pick VQGAN.

6. painting, clever apes making fire, Alexej von Jawlensky

This time we will use a new painter named Alexej von Jawlensky. Alexej Georgewitsch von Jawlensky (13 March 1864–15 March 1941), surname also spelt as Yavlensky, was a Russian expressionist painter active in Germany. He was a key member of the New Munich Artist’s Association (Neue Künstlervereinigung München), Der Blaue Reiter (The Blue Rider) group and later the Die Blaue Vier (The Blue Four).

Alexej von Jawlensky paintings
VQGAN
DALL E2

Comparison Result: Dall E2.

7. painting, clever apes making fire, Henri Rousseau

Let’s see the same input in a different painting style, which should be interesting.

VQGAN
DALL E2

Comparison Result: Dall E2.

8. painting, clever apes making fire, John Atkinson Grimshaw

And lastly, the same input in Grimshaw’s style.

VQGAN
DALLE2

Comparison Result: DALL·E 2. (I have to point out how important it is to specify the painting style each time. The differences between the last three images are big and distinctive.)

9. painting, sunlight kissing leaves gave birth to trees, Jean-Francois Millet

Jean-François Millet (4 October 1814–20 January 1875) was a French artist and one of the founders of the Barbizon school in rural France. Millet is noted for his paintings of peasant farmers and can be categorized as part of the Realism art movement. Toward the end of his career he became increasingly interested in painting pure landscapes. He is known best for his oil paintings but is also noted for his pastels, conte crayon drawings, and etchings. Examples of his paintings are shown below:

Millet Paintings

For this input, we will compare the Latent Diffusion generative model with DALL·E 2.

Latent Diffusion

Comparison Result: Dall E2. Blown up!

10. painting, sunlight kissing leaves gave birth to trees

Let’s try the same input as above without mentioning any specific painting style. Here we will compare PixRay with DALL·E 2.

PIXRAY
DALL E2

Comparison Result: PixRay. The DALL·E 2 results are OK, but I prefer the PixRay results, where the kissing action is more obvious.

11. painting, trees kissing firelight gave birth to fields of ash

PIXRAY
DALLE2

Comparison Result: PixRay. The DALL·E 2 results are OK, but I prefer the PixRay results, where the ash is more obvious.

12. painting, thousand-year old forests were felled for heat and light

PIXRAY
DALL E 2

Comparison Result: PixRay. The DALL·E 2 results are OK, but I prefer the PixRay results, where the fallen trees are obvious.

13. painting, across the known world, band of humans sit by fires and rejoiced

PIXRAY
DALL E 2

Comparison Result: PixRay.

14. painting, humans celebrating another day of survival

PIXRAY
DALLE2

Comparison Result: PixRay.

15. painting, we cut down the trees for fuel, Arnold Bocklin

Arnold Böcklin (16 October 1827–16 January 1901) was a Swiss symbolist painter. Influenced by Romanticism, Böcklin’s symbolist use of imagery derived from mythology and legend often overlapped with the aesthetic of the Pre-Raphaelites. Many of his paintings are imaginative interpretations of the classical world, or portray mythological subjects in settings involving classical architecture, often allegorically exploring death and mortality in the context of a strange, fantasy world.

Böcklin is best known for his five versions (painted 1880 to 1886) of the Isle of the Dead, which partly evokes the English Cemetery, Florence, which was close to his studio and where his baby daughter Maria had been buried. An early version of the painting was commissioned by a Madame Berna, a widow who wanted a painting with a dreamlike atmosphere.

Arnold Böcklin paintings
VQGAN
DALL E2

Comparison Result: VQGAN (it seems to depict the fuel part better)

16. painting, a road made of dead trees in the morning light, Richard Dadd

Richard Dadd (1 August 1817–7 January 1886) was an English painter of the Victorian era, noted for his depictions of fairies and other supernatural subjects, Orientalist scenes, and enigmatic genre scenes, rendered with obsessively minuscule detail. Most of the works for which he is best known were created while he was a patient in Bethlem and Broadmoor hospitals.

VQGAN
DALL E2

Comparison Result: VQGAN. DALL·E 2 doesn’t like the word ‘dead’, even when we are talking about ‘dead trees’.

17. painting, caveman lit a fire under organic soup in the forest. things began to bubble, Henri Rousseau

VQGAN
DALL E2

Comparison Result: DALL E2

18. painting, a city made of dead trees, Dorothea Tanning

Dorothea Margaret Tanning (25 August 1910–31 January 2012) was an American painter, printmaker, sculptor, writer, and poet. Her early work was influenced by Surrealism.

Dorothea Tanning paintings
Latent Diffusion

Comparison Result: Latent Diffusion (no DALL·E 2 result available because of the word ‘dead’)

19. painting, a city made of dead trees, Arnold Bocklin

VQGAN

Comparison Result: VQGAN (no DALL·E 2 result available because of the word ‘dead’)

20. painting, they gave their bodies to build our cities, Wayne Barlowe

Wayne Barlowe is a world-renowned science fiction and fantasy author and artist who has created images for books, film and galleries and written novels, screenplays and a number of art books. After attending Cooper Union he started his career painting hundreds of paperback covers for all of the major publishers and magazine illustrations for LIFE, TIME and NEWSWEEK.

Latent Diffusion

Comparison Result: Latent Diffusion

21. painting, we wrote our histories, philosophies and shopping lists on their pulpy flesh, Wayne Barlowe

Latent Diffusion

Comparison Result: Latent Diffusion (no DALL·E 2 result available because of the word ‘flesh’)

22. painting, but this was not sufficient, Wayne Barlowe

Latent Diffusion

Comparison Result: Latent Diffusion

23. painting, we cut down the trees for fuel, Wayne Barlowe

VQGAN

Comparison Result: DALL·E 2 (mainly because of its crisper images)

24. painting, trees crushed and compressed for millions of years in the pressure cooker of the earth, Tom Thomson

Thomas John Thomson (August 5, 1877–July 8, 1917) was a Canadian artist active in the early 20th century. During his short career, he produced roughly 400 oil sketches on small wood panels and approximately 50 larger works on canvas. His works consist almost entirely of landscapes, depicting trees, skies, lakes, and rivers. He used broad brush strokes and a liberal application of paint to capture the beauty and colour of the Ontario landscape. Thomson’s accidental death by drowning at 39 shortly before the founding of the Group of Seven is seen as a tragedy for Canadian art.

Tom Thomson paintings
VQGAN
DALL E2

Comparison Result: DALL E2

25. painting, extinct trees transformed to coal, Dorothea Tanning, dark and smoky

VQGAN

Comparison Result: VQGAN

26. painting, smoky clouds over crowded tenements, Hubert Robert

Hubert Robert (22 May 1733–15 April 1808) was a French painter, noted for his landscape paintings and capriccio, or semi-fictitious picturesque depictions of ruins in Italy and of France.

VQGAN
DALL E 2

Comparison Result: VQGAN (Where is the crowd in the DALL·E 2 result?)

27. painting, pickaxes scarred the earth, Zdzislaw Beksinski

Zdzisław Beksiński (24 February 1929–21 February 2005) was a Polish painter, photographer, and sculptor, specializing in the field of dystopian surrealism.

Beksiński made his paintings and drawings in what he called either a Baroque or a Gothic manner. His creations were made mainly in two periods. The first period of work is generally considered to contain expressionistic color, with a strong style of “utopian realism” and surreal architecture, like a doomsday scenario. The second period contained more abstract style, with the main features of formalism.

Beksiński was stabbed to death at his Warsaw apartment in February 2005 by a 19-year-old acquaintance from Wołomin, reportedly because he refused to lend him money.

Latent Diffusion
DALL E

Comparison Result: Latent Diffusion

28. painting, black dust scarred the lungs, Zdzislaw Beksinski

Latent Diffusion
DALL E2

Comparison Result: Equal (but different)

29. painting, forges spat out locomotives, Zdzislaw Beksinski

Latent Diffusion
DALL E

Comparison Result: Latent Diffusion

30. painting, railways demolished distance, Dorothea Tanning

Here we use Midjourney. Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.

DALL E2

Comparison Result: Midjourney

I think we now have a pretty good idea of the differences between the various image generation models. Let’s summarize the winners:

DALL-E 2: 12

Latent Diffusion: 7

VQGAN: 6

PIXRAY: 5

DALL-E Mini: 2

Midjourney: 1
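
The totals above can be reproduced from the 30 verdicts with a few lines of Python. This is only a sketch: the verdict list is transcribed by hand from the comparisons, and it assumes that each "Equal" verdict counts as a win for both models involved, which appears to be the only reading under which the six totals add up to 33 over 30 prompts.

```python
from collections import Counter

D2, MINI, VQ, LD, PX, MJ = (
    "DALL-E 2", "DALL-E Mini", "VQGAN", "Latent Diffusion", "PIXRAY", "Midjourney",
)

# One tuple per prompt, in order. A tie ("Equal") lists both models;
# crediting a tie to both sides is an assumption, not something stated above.
verdicts = [
    (MINI, D2), (D2,), (D2,), (MINI,), (VQ, D2),  # prompts 1-5
    (D2,), (D2,), (D2,), (D2,), (PX,),            # prompts 6-10
    (PX,), (PX,), (PX,), (PX,), (VQ,),            # prompts 11-15
    (VQ,), (D2,), (LD,), (VQ,), (LD,),            # prompts 16-20
    (LD,), (LD,), (D2,), (D2,), (VQ,),            # prompts 21-25
    (VQ,), (LD,), (LD, D2), (LD,), (MJ,),         # prompts 26-30
]

# Wins including ties, and wins where only one model was named.
tally = Counter(model for verdict in verdicts for model in verdict)
outright = Counter(verdict[0] for verdict in verdicts if len(verdict) == 1)

for model, wins in tally.most_common():
    print(f"{model}: {wins} (outright: {outright[model]})")
```

On this reading, DALL-E 2's 12 is made up of 9 outright wins and 3 ties.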

DALL·E 2 clearly comes out ahead of the other techniques overall. However, case by case, the other methods can also give really good results: DALL·E 2 matched or beat the alternative in only 12 of the 30 cases.

Do you agree with the above?

How would you compare the different generated images?

Would you like me to add the rest of the images here as well? So far I have processed only 2.5 minutes of the 10-minute video, which means there are plenty more images to compare.

Thanks and cheers!
