Variety of Wildcard Images Created Via Stable Diffusion

Making Wild Images With Stable Diffusion

Eric Richards


Since my last post, I’ve been indulging in generating thousands and thousands of Stable Diffusion images while playing around with the wildcard extension coupled with the X/Y/Z plot script. I had started this journey in a previous post:

I’m going to cover:

Part One (here)

  • Why I started down this path.
  • The technical setup for doing it.

Part Two

  • Insights and discoveries I made for myself.

As we go through this post and especially into the second part, I think you’ll see the images I add radically change and (I think) improve.

But Why? Why Did I Need the Wildcard Extension?

Where I was: I felt like I was in a rut. I was generating the same images from the same artists getting about the same results over and over again. So I wanted to shake it up. I thought a little directed randomness would be a good step.

And what kind of images do I like to generate? For me, it’s generating images that look like they are illustrating a story. I like to call them Untold Tales over where I publish my images on Instagram:

Fantasy Art of Dark and Light — Rufus the Ruse (@rufustheruse.art) | Instagram

Various Images Shared On Instagram

I like the image to look like it’s happening in the middle of a story, letting you ponder not only what’s happened up to that point but what’s going to happen next. Now, I’m not above becoming enamored with a very interesting subject just posing there perhaps with vivid surroundings and lighting. But usually I look for something fantastical. And I needed a boost to get me beyond the same old same old that I was generating.

Also: I abuse the ever-loving-length of my positive and negative prompts. Horribly. I rely on Automatic1111’s interface to Stable Diffusion to make it work beyond the 75 token limit by doing the blending it does.

Some people are short-prompt heroes.

I am a long prompt villain.

But How To Shake It Up? The Mechanics.

Wildcards

To enable the wildcard extension in Automatic1111, do the following:

  • Navigate to the Extensions tab.
  • Go to the Available sub-tab.
  • Select the Load from: button.
  • Install stable-diffusion-webui-wildcards.
  • Select Apply and restart UI (on the Installed sub-tab).

The wildcard extension to Stable Diffusion certainly adds the randomness I was looking for to shake things up. Once enabled, you can fill a text file with whatever lines you’d like to be randomly chosen from and inserted into your prompt. For instance, I have a file in my wildcard directory (stable-diffusion-webui\extensions\stable-diffusion-webui-wildcards\wildcards) called fantasyArtist.txt, which lists all the fantasy artists I’ve generated interesting results for in the past. Within my prompt, I can have a random line chosen from this file (and thus a random artist) by putting double underscores around the filename (__fantasyArtist__). Note that the file extension is dropped.

Now then, you’re no doubt far more creative than I am and can immediately see how this wildcard, used repeatedly in a prompt, can refer to all sorts of random lines out of all sorts of random files. Like the word game MadLibs? You can basically construct a prompt that is a rough template full of wildcards and then set up a large batch count job for your favorite Stable Diffusion model and see what kind of random results you get.
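To make the MadLibs idea concrete, here is a minimal Python sketch of the substitution. It is not the extension’s actual code, just an illustration of the behavior described above: every __name__ token is swapped for a random line from name.txt. The function name expand_wildcards and the regular expression are my own assumptions for the example.

```python
import random
import re
from pathlib import Path

# Assumed token shape: double underscores around a file name without its extension.
WILDCARD = re.compile(r"__([\w-]+?)__")


def expand_wildcards(prompt: str, wildcard_dir: Path, rng: random.Random) -> str:
    """Replace each __name__ token with a random line from <wildcard_dir>/name.txt."""

    def pick(match: re.Match) -> str:
        path = wildcard_dir / f"{match.group(1)}.txt"
        if not path.is_file():
            # Unknown wildcard: leave the token in place so it is easy to spot.
            return match.group(0)
        # The file is re-read on every substitution, so edits show up on the next pick.
        lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines()]
        lines = [ln for ln in lines if ln]
        return rng.choice(lines) if lines else match.group(0)

    return WILDCARD.sub(pick, prompt)
```

Calling expand_wildcards("art by __fantasyArtist__", Path("wildcards"), random.Random()) would return the prompt with one artist from fantasyArtist.txt filled in. Each token is picked independently, so using the same wildcard three times can give three different lines.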

See something you like, pick apart what you like about it and then use that as a more refined template.

Wildcard Files

Need inspiration? A couple of sets of example wildcards are here. If you’re not into using GitHub directly, you can just download them as a zip and move whatever files you find useful into your wildcards directory.

If you’re into using an AI chatbot, you can also invest time in explaining Stable Diffusion to it and how prompts work and then use it to generate lines of text for you that you can put into a wild card file. I’ve done this to generate abstract locations and even to generate quotes from books or song lyrics to add some atmospheric randomness to my prompt.
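Chatbot output usually arrives as a numbered or bulleted list with quotes and the odd duplicate, while a wildcard file wants one plain entry per line. Here is a small Python sketch for tidying that up before saving it. The function name clean_wildcard_lines and the cleanup rules are assumptions for illustration, so adjust them to whatever your chatbot actually produces.

```python
import re
from typing import List


def clean_wildcard_lines(raw_text: str) -> List[str]:
    """Turn pasted chatbot output into one-entry-per-line wildcard content."""
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        # Drop leading list markers such as "1.", "2)", "-", "*", or "•".
        entry = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", line).strip()
        # Drop straight or curly double quotes wrapped around the entry.
        entry = entry.strip('"“”')
        key = entry.lower()
        if entry and key not in seen:  # skip blanks and case-insensitive repeats
            seen.add(key)
            cleaned.append(entry)
    return cleaned
```

You could then write the result with something like Path("locAbstract.txt").write_text("\n".join(lines), encoding="utf-8"), where the file name is whatever wildcard you want to reference in the prompt.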

But I’m getting ahead of myself.

Here’s an example early wildcard prompt I started with:

Epic __EmbPrimeArtists__ [detailed color pencil:photo shoot:0.3] by __fantasyArtist__ and (__classicArtists__:0.5) and (__classicArtists__:0.5), HQ, 8K, hyper detailed, The __female__ surrounded by __dreamyThings__ (face looking __expression__:1.5), __keyword__, __keyword__, __keyword__, __timeOfDay__, timeless realization of the facts of life and our time is limited on this earth, Action, cinematic dramatic lighting, bokeh, shot on Canon 5D,masterpiece [oil painting:hyperrealism:0.3] in the style of (__compositionArtists++:0.5)

Taking this apart:

  • EmbPrimeArtists.txt contains a list of textual inversions representing artist styles that I like.
  • fantasyArtist.txt contains the list of fantasy style artists that I like.
  • classicArtists.txt contains the list of non-genre high quality artists that I like.
  • female.txt contains a list of ladies in various kinds of situations and dress: detective, warrior, goddess, and so on.
  • dreamyThings.txt contains a list of odd dreamy objects to surround the subject with. Penguins work well. Too well.
  • expression.txt is key to me. It contains various facial expressions. This is important for what I do in that it avoids dead, bored-looking faces. I’d rather have a little smile or full-on wicked grin than a vacuous stare.
  • keyword.txt is interesting. I went through one of the sites showing all the keywords that affect Stable Diffusion (Stable Diffusion V1 Modifier Studies | Gallery View (notion.site)) and collected the ones I like. Some of these can have a huge impact on the rendering of the image and come up with results I never would have stumbled across on my own (well, not anytime soon).
  • timeOfDay.txt contains various descriptions for the time of day (like Golden Hour or Sunrise).
  • compositionArtists.txt contains the list of all the artists I found to have a major effect on the composition of the image. A lot of my previous posts cover these kinds of artists, like, say, Mike Mignola.

But that prompt above has an error. Can you see it?

I have __compositionArtists++ instead of __compositionArtists__. Whoopsee. It took me a bit to find that. If you refer to a file that doesn’t exist as a wildcard, it will be noted as a warning when the job begins but won’t prevent the job from running. But something like my typo would not be flagged.

After I fixed the error, I found I didn’t like the resulting images as much as when it was running with the bogus error.
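A typo like that one can be caught before a long run with a quick pre-flight check. Here is a minimal Python sketch, assuming wildcard names never contain a double underscore themselves. The function name lint_prompt and its messages are hypothetical, and this is separate from whatever checking the extension does on its own.

```python
import re
from pathlib import Path
from typing import List


def lint_prompt(prompt: str, wildcard_dir: Path) -> List[str]:
    """Return a list of likely wildcard mistakes found in a prompt."""
    problems = []
    chunks = prompt.split("__")
    names = chunks[1::2]  # odd-indexed chunks sit between a pair of "__" markers

    # Balanced markers give an odd number of chunks: text, name, text, ..., text.
    if len(chunks) % 2 == 0:
        problems.append("odd number of '__' markers: a wildcard is probably not closed")
        names = names[:-1]  # the last "name" is really just trailing text

    for name in names:
        if not name or re.search(r"\s", name):
            problems.append(f"suspicious wildcard name: {name!r}")
        elif not (wildcard_dir / f"{name}.txt").is_file():
            problems.append(f"no file for wildcard __{name}__")
    return problems
```

Run against the first prompt above, the unbalanced __compositionArtists++ would trip the odd-marker check, and any wildcard without a matching .txt file would be listed by name.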

So that’s how it started. I’d create wildcard files and then create prompts referring to those files and kick off large batch runs to generate dozens, or hundreds, of images.

And it changed over time in some ways and in some ways not. A lot changed as to what went into those files. Those dreamy penguins got deleted real fast. The files are read dynamically as your job of perhaps 100 images runs, so if you have ideas to revise, add, or delete text in a file, you can do so while the job is running.

As I looked through the GitHub repositories of example wildcard files mentioned above, I would pick and choose files I found interesting or useful, editing them as needed. Some of the files I’d find useful over time:

  • Artist-csv.txt — a very long file containing a list of many artists. I eventually used this to discover new artists that had appreciable images in Stable Diffusion.
  • Photo_Camera.txt — a list of various camera models to use in the prompt to say what kind of camera took the image.
  • Photo_angle.txt — really useful for basing the image at different angles.
  • Medium.txt — what kind of medium is used in creating the image? Watercolor, oil, acrylic, etc.
  • Technique.txt — a list of visual aspects, like lens flare or bokeh or halftone.

Here’s an evolved prompt that looks both the same and different:

Epic realistic [detailed color pencil:photo shoot:0.3] by __artist-csv2__ and (__fantasyArtist__:0.5) and (__classicArtists__:0.5) The beautiful __keyword__ __female__ surrounded by __keyword__ __dreamyThings__ (face looking __expression__:1.2), __keyword__, __keyword__, __keyword__, __timeOfDay__, __locAbstract__, action shot hero pose, Action, cinematic dramatic lighting, dark, high contrast, bokeh, HQ, 8K, HDR, hyper detailed, __photo_angle__, photograph shot on __photo_camera__, masterpiece [oil painting:hyperrealism:0.3] photo in the style of (__artist_photographer__:0.5)

A Bit More Interesting Set of Wildcard Run Images

In that prompt, artist-csv2.txt is a modified version of the artists file where I’ve added some artists and embeddings. For the wildcards I haven’t discussed yet, you can probably discern each one’s meaning from its name.

A later prompt to mix things up, aiming to be highly detailed if not photo-realistic:

__leadDescription__ 16k, 8k, 4k, 4k UHD, ultra HDR, perfect quality, insane quality, extreme quality, intricate, ultra quality, super quality, perfect detail, very high detail, insanely detailed, extremely detailed, intricate detail, ultra detail, super detail, perfect resolution, very high resolution, insane resolution, (cinematic shot:1.6), depth of field, rule of thirds, ((__medium__)) , (masterpiece art by __artist-csv2__ ) and (__classicArtists__:0.25) of __poemLine__ __locAbstract__ beautiful __female__ (face looking __expression__:1.3) surrounded by __dreamyThings__ while __songLyricLine__, __timeOfDay__, __keyword__, __keyword__, __keyword__, cinematic dramatic lighting, bokeh, HQ, sharp focus, __photo_angle__, DSLR photo, __technique__

Better Images With More Variety In Style and Composition

Note that here I’ve added some stuffing by bringing in poemLine.txt and songLyricLine.txt (I rarely used them together). This stuffing helps with the randomness. The poem file holds the first lines of major well-known poems. The song lyrics are major lyric lines from significant pop songs. Most of the content is from ChatGPT, edited to my own tastes.

Side note: Stable Diffusion doesn’t care about grammar or English sentence structure. It cares about words and where they appear in the prompt. Additionally, note that Automatic1111 does prompt token magic to group tokens together once you’re over the 75 token limit.

While you can select a model and then run a large batch run, I was downloading models like crazy from Civitai to test out the new models. As I mentioned in my previous post, I changed the script dropdown from None to X/Y/Z plot and did the following:

  • For X, I chose CFG Scale and entered something like the following to go from 5 to 10 in increments of 0.25: 5.0-10.0(+0.25)
  • For Y, I chose Checkpoint name and then limited it just to the checkpoints I wanted to run against.
  • Made damn sure that Batch count was set to one and that Keep -1 for seeds was checked.

Now, select Generate.

I check the beginning of the output in the command prompt where Stable Diffusion is running to make sure nothing crazy is happening. For instance, I expect a few hundred images, not twelve thousand. I also make sure that there are no complaints about missing wildcard files indicating that I screwed up the prompt.
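The expected image count is easy to work out ahead of time: it is the number of X values times the number of Y values (times batch count and batch size). Here is a small Python sketch of that arithmetic, assuming the range includes both endpoints. The function names and the checkpoint count of 12 are made up for the example.

```python
from decimal import Decimal
from typing import List, Sequence


def expand_range(start: str, end: str, step: str) -> List[Decimal]:
    """List every value from start to end (inclusive) in increments of step."""
    lo, hi, inc = Decimal(start), Decimal(end), Decimal(step)
    if inc <= 0:
        raise ValueError("step must be positive")
    values = []
    current = lo
    while current <= hi:  # Decimal avoids float drift when stepping by 0.25
        values.append(current)
        current += inc
    return values


def expected_image_count(
    x_values: Sequence, y_values: Sequence, batch_count: int = 1, batch_size: int = 1
) -> int:
    """One image per X/Y grid cell, multiplied by batch count and batch size."""
    return len(x_values) * len(y_values) * batch_count * batch_size


cfg_scales = expand_range("5.0", "10.0", "0.25")   # 21 CFG values
checkpoints = [f"model_{i}" for i in range(12)]    # hypothetical: 12 checkpoints
total = expected_image_count(cfg_scales, checkpoints)  # 21 * 12 = 252
```

With Batch count accidentally left at 50, the same grid would balloon to 12,600 images, which is exactly the kind of surprise worth catching in the first lines of console output.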

And then I come back hours later to triage. As I’ll discuss later, triaging a few hundred images is very, very important.

What Models?

I put the models I’m currently using for image generation into their own subdirectory in the models\Stable-diffusion directory. I’ve used many, many different models over the past few months, some allowing more artistic mediums and some limited to photo-realistic output.

What I’m currently using (linking to CivitAI — these might be posted elsewhere):

What LoRAs?

I think LoRAs are great; I just haven’t been using them much so far. Basically, imagine extracting out of a new model what makes it unique and using that with the models you have. A LoRA is a lot smaller than a full model (hundreds of megabytes instead of multiple gigabytes).

There is one that I’ve started using, however. If you don’t feel like diving into LoRAs and using them, don’t worry about it. But given that I’m usually disappointed with how bright Stable Diffusion output is, the EPI Noise Offset LoRA can be added to your prompt to darken the output.

If you do that with a model that already has the noise offset mixed into the model (like Lyriel) you’re going to get double-dark results. FYI.

What Negative Prompt?

The negative prompt can be just as important as the prompt itself. My current negative prompt is re-dunk-alicious. Hold onto your cap:

cleavage, (close-up:1.6), boring, face paint, face jewelry, portrait, mutated, front-facing, blue eyes, anime, UnrealisticDream.pt, BadDream.pt, (easynegative.pt:1.0),(bad-hands-5:1.0), (ng_deepnegative_v1_75t.pt:1.0),((nude)),((naked)),((sexy)),((nsfw)), face paint, cartoon, animated, toy, figurine, frame, framed, perfect skin, malformed sword, (low quality, worst quality:1.3), FastNegativeEmbedding.pt, glowing breasts

It comes in at 430+ tokens. A lot of that is due to the negative embeddings (textual inversions) that I’m using. You can find these textual inversions on CivitAI. Additionally, you can see that I strive for G / PG output, which succeeds 90% of the time.

And from all of this, I’ve learned some things. I’ll cover that in the next post, part two:


Eric Richards

Technorati of Leisure. Ex-software leadership Microsoft (Office, Windows, HoloLens), Intel Supercomputers, and Axon. https://www.instagram.com/rufustheruse.art