AI Masterpieces and Mishaps: Portrait of Varin Felwing

In preparation for the release of Flicker, Chapter 10, I asked OpenAI’s DALL-E 2 to help me generate images that reflected the chapter’s narrative, vibes, and characters.

Chapter 10 takes place in Solin Felwing’s past, and includes yet another conflict with his older brother, Varin. A villain in Solin’s early life, Varin made his on-page debut in Chapter 5.

After some failed attempts to re-create the opening scene in Chapter 10, I decided to try a prompt for Varin. Along with some physical descriptors, I asked for a specific age range (the Drakons are teens in this chapter) and a sword and leather armor (so as not to get contemporary clothing). I also included some character traits, such as “mean” and “hard eyes” and “snarls at you.”

Here are the four images that DALL-E 2 generated for me:

Image A

A close-up image of a man with light brown skin and dark brown-black hair, which is short and hangs over his forehead. He is looking at the viewer with intention and slight ferocity, and holds a shiny silver-and-gold object that looks swordlike, although only his face and the lower blade are in the frame. This was supposed to be a generated photo of Varin Felwing at a specific time of life, but there are small details that aren't quite right. Generated by DALL-E 2.

Here, DALL-E attempted to include the detail of the sword but seems to have ignored the armor. It also paid attention to the fact that I wanted Varin to have that “I’m gonna attack you” stance. No trace of a snarl. Age is a little older than Varin is in the scene. Alas, I was also disappointed in myself because despite knowing that AI images have inherent biases (as a result of their training materials having biases), I’d forgotten to be extremely clear about Varin’s complexion, so this fella isn’t quite fitting the bill.

(If you aren’t familiar with the biases of AI, definitely take a gander on the interwebs at more legitimate sites about how that arises. Essentially, because society is laden with -isms both overt and subconscious, the images people put out there tend to be in line with societal values and trends. Some of these values and trends are harmful, and some aren’t harmful but perhaps annoying because of oversaturation. At some point, I may have time to share an example of the harmless-but-annoying side of this too. For another chapter, I tried to get a simple photo of Solin pouring Scotch whisky into a glass. The number of dudes in red flannels was astounding, but not surprising, given recent marketing trends for craft beers and spirits to feature flannelled dudes.)

Image B

A mid-shot of a man with brown skin and dark brown-black hair, which is shorter and is swept airily backward. He is looking at the viewer with intention and slight ferocity, and he holds a shiny silver object across his body as though ready to attack with it. The object has a dagger-like shape, but where the blade connects to the handle looks jumbled and somewhat fragmented. This was supposed to be a generated photo of Varin Felwing at a specific time of life, but there are small details that aren't quite right, such as different eye colors, odd shoulder proportions, and the weird dagger, to name a few. Generated by DALL-E 2.

Look- and vibes-wise, this generation gives me more of Varin’s antagonistic character, and we even get more of the body in the frame too. Age-wise, it seems to work better for the scene. Unfortunately, there are just some general image hiccups with the details. This image also appears to put Varin in an outfit that is half modern t-shirt, half tunic, and the blade is not a sword, but a long dagger that has crystalline fragments that appear to meld into Varin’s chest.

Image C

A close-up image of a digitally rendered 3D man with light brown skin and brown hair that hangs across part of his face. He looks like he is from an early generation of a video game console that supported 3D games, with some details smooth and some of them somewhat cartoon-like. His head is tilted, his eyes are a reddish-brown, and his eyebrows are very thin and somewhat curved. He has an otherwise blank look. An object that could be the hilt of a sword is in the corner of the frame. This was supposed to be a generated photo of Varin Felwing at a specific time of life, but some details are odd, including a strange reflection in one of his eyes. Generated by DALL-E 2.

What even is this? The sword is barely in frame, his look is totally blank, and he looks like a video game character from early 3D consoles. When I saw this, I realized my prompt had failed to include specifics about the type of art, which is why DALL-E 2 mixed this 3D character into what otherwise looks like a batch of photos. Again, we see the inherent biases at play.
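For anyone who builds prompts in a script rather than typing them by hand, here is a minimal sketch of a checklist that would have caught the two gaps I ran into with this batch: no art style and no explicit complexion. Everything in it is hypothetical. The CharacterPrompt class and its field names are my own illustration and are not part of DALL-E or any OpenAI tool, and the complexion value is a placeholder, not Varin's actual description. It only assembles a text string and does not call any image service.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterPrompt:
    """Hypothetical prompt checklist; not part of any DALL-E API."""
    art_style: str = ""      # e.g. "photo"; leaving this out gave me a 3D render
    age_range: str = ""      # e.g. "teenager"
    complexion: str = ""     # must be spelled out explicitly
    wardrobe: list[str] = field(default_factory=list)
    traits: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Return the names of required fields that are still blank."""
        required = {
            "art_style": self.art_style,
            "age_range": self.age_range,
            "complexion": self.complexion,
        }
        return [name for name, value in required.items() if not value.strip()]

    def build(self) -> str:
        """Join the fields into one prompt string, refusing if any are blank."""
        gaps = self.missing()
        if gaps:
            raise ValueError("prompt is missing: " + ", ".join(gaps))
        subject = f"{self.art_style} of a {self.age_range} with {self.complexion}"
        return ", ".join([subject] + self.wardrobe + self.traits)


# Roughly the prompt described in this post: age, wardrobe, and traits only.
draft = CharacterPrompt(
    age_range="teenager",
    wardrobe=["sword", "leather armor"],
    traits=["mean", "hard eyes", "snarls at you"],
)
print(draft.missing())  # lists the fields I forgot

# Fill the gaps (placeholder complexion text) before building the prompt.
draft.art_style = "photo"
draft.complexion = "[complexion as described in the series]"
print(draft.build())
```

The point of the design is simply that the script refuses to produce a prompt until the easy-to-forget fields are filled in. It does nothing about the biases in the model itself.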

Image D

A close-up image of a man with pale brown skin and black hair, which is straight but textured to different lengths, giving him a punk-rock look despite it not being styled heavily. His head is slightly tilted, and he has an intent look with a "Mona Lisa" smile. Neither his clothing nor his silvery sword can be seen in their entirety, but his coat has a high collar that is reminiscent of designs based on naval coats, with multiple studs or buttons lining the collar. This was supposed to be a generated photo of Varin Felwing at a specific time of life, but there are small details that aren't quite right, including his right eye, which appears to have a half-colored iris that is blue instead of brown like his left eye. The photo is otherwise very realistic, but other details simply do not fit Varin's actual likeness as described in The Fire of Felwing series. Generated by DALL-E 2.

With this character, DALL-E 2 didn’t seem to interpret the age or personality traits correctly at all. We have an adult male with a Mona Lisa smile who appears to be having his actor headshot taken. We get the sword, sort of, although it appears like the sword is just floating in front of him. And while we don’t see much of the armor, it’s clear that there is something more fantastical about his outfit.

Masterpieces or Mishaps?

Unfortunately, I can’t say DALL-E 2 met my expectations of reproducing Varin with any of these images. I believe all three photo-realistic images could possibly have uses elsewhere with a little cropping or a few touch-ups, but they are not Varin Felwing.

These images bring up so many questions about how to train AI not only to generate according to specifications entered by prompters, but also to create images free of societal biases. Let’s be direct here: the figure who had the darker complexion also happened to be the one who generated with the proper personality characteristics and age from the prompt. While this works for Varin Felwing specifically, the AI largely ignored these characteristics in my prompt with the subjects that generated with lighter complexions. That could reflect inherent colorism in the training materials, which are themselves reflections of societal values and beauty trends.

And let’s be direct again: the dude in Image D, by contemporary beauty standards, is photogenic as hell. This is in part due to the compositions of both his pose and the image itself. But does this suggest that a hefty majority of well-posed and well-composed images DALL-E 2 was trained on included people with paler complexions? Or is my prompt an outlier, a coincidence in which the “bad” personality traits I wanted present on all alternates just happened to be more pronounced on the subject with the darker complexion, which also happens to align with racist and xenophobic representations?

It is impossible to know for sure, at least on our end with this particular prompt and these generations, but bias in the training materials remains a likely explanation given the way this type of AI works. These are questions we must continue asking of ourselves as prompt engineers, of the AI, and of AI developers. That said, I remain hopeful for DALL-E’s continued development, as part of OpenAI’s mission is to enhance the safety of AI and reduce bias in interactions.

A Resource

It’s clear DALL-E has a ways to go before it can replicate individuals it produces, given that it still fails to read and respond to prompts it is given. Let’s say I had decided to go with Image B for Varin; I could not re-create Varin using DALL-E. The AI is not currently capable of making that same figure re-appear. It could not age up the figure for a future chapter or pose the figure differently for a different scene. I could delete, for instance, the weird dagger and ask DALL-E to fill in the blank canvas, but it will still only be able to work with that small area and my prompt. That version of Varin only exists in that image.

I think this creates an opportunity for artists to use images generated by AI like DALL-E as references. Human artists already use references of pre-existing works and models, and artists also learn via direct or stylistic reproductions. This is one way, I think, that human artists can take AI generations to the next level, using the AI like a resource to supplement and support their work, as well as helping clients better articulate any requests.

Until next time.

