Two ComfyUI Additions Useful for Stable Diffusion AI Art
With SDXL, there’s a lot going on in the ComfyUI world. There are many nodes and complex workflows.
There are two things I’ve added to my basic SDXL workflow lately that have improved my results:
- The “Wrong” SDXL LoRA, for less-bad hands.
- The Ultimate SD Upscaler custom node.
I have a link to a PNG image with my embedded flow later in this article.
A Wrong to Make It Right
The Wrong LoRA can be found here: SDXL Wrong LoRA — v1.0-Diffusers | Stable Diffusion LoRA | Civitai
There’s a good write-up about the process of creating this LoRA here: I Made Stable Diffusion XL Smarter by Finetuning it on Bad AI-Generated Images | Max Woolf’s Blog (minimaxir.com)
You add the LoRA into your model flow and then add the word “wrong” to your negative prompt.
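If you export a ComfyUI workflow in its API JSON format, the wiring looks roughly like the sketch below. This is a hypothetical fragment, not my actual flow: the node IDs, filename, and upstream node references are placeholders, though `LoraLoader` and `CLIPTextEncode` are the real built-in node types with these input names.

```json
{
  "10": {
    "class_type": "LoraLoader",
    "inputs": {
      "lora_name": "sd_xl_wrong.safetensors",
      "strength_model": 1.0,
      "strength_clip": 1.0,
      "model": ["4", 0],
      "clip": ["4", 1]
    }
  },
  "7": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "wrong, blurry, low quality",
      "clip": ["10", 1]
    }
  }
}
```

The key bits: the LoRA loader sits between the checkpoint loader and everything downstream, and the negative prompt text simply includes the word “wrong”.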
Snippet of my flow:
And just to be clear: I’m adding three LoRAs to my model flow here: the “Wrong” LoRA we’re discussing, one to make things look more artful, and the Stability AI offset LoRA. You can use just the “Wrong” LoRA. The last LoRA in the flow chain gets connected to everything else.
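Why doesn’t the order of the chain matter beyond convenience? Each LoRA contributes a low-rank delta that gets added on top of the base weights, scaled by its strength, so stacking several LoraLoader nodes composes additively. A minimal sketch of that math (the shapes, strengths, and “three LoRAs” mirror my setup, but the matrices are random toys):

```python
import numpy as np

def apply_lora_chain(base_weight, loras):
    """Apply a chain of LoRA deltas to one base weight matrix.

    Each LoRA contributes strength * (up @ down) added onto the base
    weight, so chaining loaders composes additively and the order of
    the chain doesn't change the final weights.
    """
    w = base_weight.copy()
    for strength, up, down in loras:
        w += strength * (up @ down)
    return w

rng = np.random.default_rng(0)
base = rng.standard_normal((8, 8))

# Three hypothetical rank-2 LoRAs (think: "wrong", an art-style LoRA,
# and the offset LoRA), each with its own strength.
loras = [(s, rng.standard_normal((8, 2)), rng.standard_normal((2, 8)))
         for s in (1.0, 0.7, 0.5)]

a = apply_lora_chain(base, loras)
b = apply_lora_chain(base, list(reversed(loras)))
assert np.allclose(a, b)  # chain order doesn't affect the result
```

This is also why experimenting with strengths is cheap: you’re just scaling each delta up or down.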
Does it fix hands? No, it doesn’t fix hands, but I see far fewer splayed hands that look like exploded bananas. Hands are still really bad in SDXL, which I talked about on Reddit a bit: Hands in SDXL — Some Thoughts (Stability AI Proposal) : StableDiffusion (reddit.com).
I suggest experimenting with various strengths to see if you notice fewer wrong things in your batches of images.
Adding Upscaling to Your Flow
My last flow in Automatic1111 had upscaling — it definitely improved the image to upscale it at least 1.5 times. My first attempt at using the Ultimate SD Upscaler flamed out really poorly. I fixed it with a bit of education. So let’s go over the two steps I recommend here:
One: install it via the ComfyUI Manager. This manages which custom nodes you have installed. You can read the last post here regarding setting that up:
Two: watch Scott Detweiler’s video about creating an upscaling workflow with this custom node:
The two techniques I picked up from this video:
- For SDXL, set the tile size to 1024, since that’s its main trained resolution.
- Hook up empty positive / negative prompts to the upscaler unless you really want to tell it to do something.
My first attempts, even at a low denoise value, made the tiles very obvious — the scene was about right, but each 512x512 tile was off a little bit. E.g., ocean waves would be calm in one tile and rougher in another. Also, because I was re-using my positive and negative prompts, new things would show up in the image, like little people. I don’t want new little people sprinkled around in my upscale.
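To see why the tile size matters, here’s a simplified model of how a tiled upscaler covers the output image: tiles advance by (tile size minus overlap) so neighbors share a blended seam region. This is my own sketch, not the Ultimate SD Upscaler’s actual code, and the 64px overlap is an assumed value — but it shows why 1024px tiles mean far fewer tiles (and seams) than 512px tiles, with each tile at SDXL’s native resolution.

```python
def tile_origins(image_size, tile_size, overlap):
    """Top-left origins of tiles covering one image dimension.

    Tiles step by (tile_size - overlap); a final tile is appended if
    the stepped tiles don't reach the image edge.
    """
    stride = tile_size - overlap
    origins = list(range(0, max(image_size - tile_size, 0) + 1, stride))
    if origins[-1] + tile_size < image_size:
        origins.append(image_size - tile_size)
    return origins

# Upscaling a 1024x1024 SDXL image 2x -> a 2048x2048 output.
for tile in (512, 1024):
    xs = tile_origins(2048, tile, overlap=64)
    print(f"{tile}px tiles: {len(xs) ** 2} tiles to diffuse")
```

With 512px tiles you’re diffusing 25 tiles, each at a resolution SDXL wasn’t trained on; with 1024px tiles it’s 9, which is why the seams calm down.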
With the changes I learned from Scott’s video, it now works well for me. Sort of out of context, but here’s a clip of my upscaler:
I probably need to play around with the denoise value and the upscaling model some. The faces do change a bit, and some nice skin tone and wrinkles get removed, which I personally do not like. In general, though, the upscaled result is a better image.
Workflow? I’ve added a PNG to my GitHub with the workflow that I’m using. You can grab this PNG and drag it into ComfyUI to see my setup. No, it’s not pretty.
Note that you’ll need to fix up the models being loaded to match your models / locations, plus the LoRAs. I do load the FP16 VAE off of CivitAI. You can use the ComfyUI Manager to resolve any red nodes you have. But beware. Keep reading.
Automatic1111 and ComfyUI Thoughts
Supposedly, work is being done to give A1111 better memory usage. I can’t use it right now because my 8GB card doesn’t provide enough memory.
Does that mean I’ll be able to run a similar setup in A1111 that I run in ComfyUI? On my 8GB video card? Doubtful, but maybe. 12GB is becoming the new minimum. And for how long?
I prefer A1111 because of the power of its prompting. I don’t mind getting my hands dirty with lots of nodes, but it’s far better to have an expressive prompt like A1111 provides, with additional extensions along the way, than to pull in random custom nodes off of GitHub to accomplish something that’s like, but not exactly like, what A1111 provides.
My main concern: the security threat model of ComfyUI. With custom nodes, you’re pulling in straight Python code. Code that can do just about anything it wants. I don’t see any attempt at sandboxing / security in the extensibility model here. Boy, I’d like to be wrong and be strongly corrected that there is a trustworthy security model. I’ve looked through the code. The risk is pulling in some random snazzy node that has foul, malware intentions. A target audience with powerful graphics cards? That’s alluring.
That’s to say nothing of supply-chain / typo-squatting attacks. I’ve gone through discussions in some custom nodes’ GitHub repos, and some of the authors admit they don’t really know what they’re doing in Python, which opens things up to inadvertent calamity. Then there’s the prospect of future bad actors taking over popular nodes.
While extensions are nice in A1111, custom nodes are essential in ComfyUI, especially if you’re downloading what appears to be a snazzy workflow from someone else.
So beware of red nodes being fixed by bringing in new code. Be sure that you trust the source. Security is on you for whatever custom nodes you decide to bring in and run on your machine.
I’d be much happier if ComfyUI had a sandboxed version of the nodes, or at least an option to attempt that, knowing that some nodes may not run.
So if the day comes that I can run A1111 again, I don’t believe I’ll hesitate. ComfyUI has been a relief to have, but I think it’s just waiting to burn me.