Artful AI Visions of My Sweetheart
For this post I have:
- Interesting resources I’ve seen lately.
- Notes on rendering my wife's likeness in Stable Diffusion, comparing DreamBooth and textual inversion embeddings.
Recent Resources
Starting out: okay, anyone creating Stable Diffusion art should totally read this one. It’s a visual book that goes through the various aspects of crafting a prompt to create your image: Stable Diffusion Prompt Book — OpenArt | OpenArt
Prompt crafting: the best technique tip I've read this month, and the one that has let me create incredible images: Carne-vall style. A few pointers on how to use Carne Griffiths without making things go *too* crazy. : StableDiffusion (reddit.com). I used the insights here to create some stunning images of my wife. I just love the results.
Workflow with Stable Diffusion: a video of an interesting technique: CYBERPUNK SCENE: Entire process of creating a cyberpunk scene using Stable Diffusion | Timelapse. It was a revelation seeing the work go from Stable Diffusion to Blender, back into Stable Diffusion, and in and out of other tools to composite a scene of four cyberpunk guys hanging out. The part I really liked was the initial posing of 3D mannequins with rough painting over them, then using img2img to generate a batch of characters.
Meta Workflow: an artist using Stable Diffusion to create art in his style: The Stable Diffusion is an AI that grants artists absolute freedom, its completely open source. Most people are not ready for it, not ready to be unplugged. Are you listening to me or are you looking at the woman in the red dress, Neo? [Quick art done with SD with the process tutorial!] : StableDiffusion (reddit.com)
Resource: AI Art Resources, Tools & Inspiration For Designers And Prompt Engineers: AIArtApps.com
Outpainting example: Outpainting — sd-v1–5-inpainting model : StableDiffusion (reddit.com) — I've done very little with img2img, inpainting, and the like, so a practical example is useful inspiration.
After Myself, Train (a model of) My Sweetheart
This past week, after failing to get DreamBooth training working locally at home, I spent time creating renders of my wife, especially trying to create AI art of myself and my wife together in a fantastic setting.
At a high level, what I tried:
- Google Colab — ShivamShrirao DreamBooth.
- Google Colab — FastBen DreamBooth.
- A textual inversion embedding. Overtrained.
- A textual inversion embedding. Just right.
Given the praise I hear for the Shivam DreamBooth, I guess I must be doing something wrong; I almost always get weak results out of it. I gave it another try before switching back to the FastBen DreamBooth Google Colab, and I was pretty happy (at least happier) with those results. But since they still weren't very "Wow!", I went back to A1111's textual inversion embedding training.
Much Too Much
The first time, I overdid it. I had well over 200 training images, including flips, and it was too much. As a result, any mention of the embedding in a prompt forced the output to be a photograph, often with multiple people wearing my wife's face. There was zero way to style it.
Less Is More
So I created a second embedding trained on 60 unique images. Now I could apply styles again without everything turning into a photograph, and the image no longer felt dominated by the training. One of my favorite learnings from the Stable Diffusion subreddit is a prompt like the following:
A [handpainted:photo:0.5] artwork by Alfons Mucha and Jeremy Mann of the face YOURTRAININGHERE, wearing glasses, she is centered in the picture, nighttime city, intricate, trending on artstation, highly detailed, [oil painting:Hyperrealism:0.5]. In the style of Carne Griffiths.
Obviously, drop in the name of your embedding at the appropriate place. (The [handpainted:photo:0.5] and [oil painting:Hyperrealism:0.5] pieces are A1111 prompt editing: the sampler uses the first term for the first half of the steps, then switches to the second.)
While this embedding still mutes the styling of a scene a bit, it can be styled, and I've had an absolute blast creating wonderful images with the woman I love.
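For anyone who prefers scripting to the A1111 UI, here's a rough sketch of the same idea using Hugging Face diffusers. I'm assuming a recent diffusers release (which can load A1111-style .pt embeddings); the checkpoint name, file path, and token are placeholders, and since diffusers doesn't support the [a:b:0.5] prompt-editing syntax, I've picked single style terms instead:

```python
# Minimal sketch: load a textual inversion embedding with Hugging Face
# diffusers and render a styled portrait. The paths and the token name
# ("<my-sweetheart>") are placeholders for your own training output.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # the SD v1.5 checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Load the learned embedding; the token is whatever placeholder you trained.
pipe.load_textual_inversion("./embeddings/my-sweetheart.pt", token="<my-sweetheart>")

# Single style terms here, since diffusers has no [a:b:0.5] prompt editing.
prompt = (
    "A handpainted artwork by Alfons Mucha and Jeremy Mann of the face "
    "<my-sweetheart>, wearing glasses, she is centered in the picture, "
    "nighttime city, intricate, trending on artstation, highly detailed, "
    "oil painting. In the style of Carne Griffiths."
)
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("sweetheart_mucha.png")
```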
Additionally, as an accidental experiment, I uploaded all of these images of my wife to a Google Photos album, where Google Photos had already learned her face from previous photos. For many of the images, when I searched for her, Google Photos agreed that the face matched its recognition of her.
Next — A Return to DreamBooth
I'd much rather use a DreamBooth model, because while the latest embedding can be styled, it still flattens the style of the image. Things look far more interesting with DreamBooth… so I plan to give it one more go now that v1.5 of the model has been out for a while and developers have steadily transitioned to it.
I'm a bit cautious about using FastBen again because it's improving so quickly, day by day. That's good news, but it makes for a moving target. My plans are:
- Do the multi-subject training with my face and my wife's face (see the sketch after this list). I really want to create quality first attempts of myself and my wife together. I find the textual embeddings fight each other (e.g., I take on traits of my wife's face).
- Keep that model around for future iterations within DreamBooth.
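For the multi-subject run, the Shivam colab drives training from a concepts_list with one entry per subject. Here's roughly what I have in mind; the rare tokens and directory paths are placeholders I made up, not the colab's defaults:

```python
# Rough sketch of the concepts_list the ShivamShrirao DreamBooth colab
# consumes for multi-subject training. The rare tokens ("zwxhusband",
# "zwxwife") and the directory paths are placeholders.
import json

concepts_list = [
    {
        "instance_prompt": "photo of zwxhusband person",  # me
        "class_prompt": "photo of a person",              # regularization class
        "instance_data_dir": "/content/data/husband",
        "class_data_dir": "/content/data/person",
    },
    {
        "instance_prompt": "photo of zwxwife person",     # my wife
        "class_prompt": "photo of a person",
        "instance_data_dir": "/content/data/wife",
        "class_data_dir": "/content/data/person",
    },
]

# The colab reads this JSON when it launches training.
with open("concepts_list.json", "w") as f:
    json.dump(concepts_list, f, indent=4)
```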
So, Google Colab willing (because I can’t get it to run locally), I’ll have some new results to cover soon.