Found my upscaler useful? Support me on ko-fi https://ko-fi.com/bericbone
An upscaling method I've designed that upscales in smaller chunks until the full resolution is reached, with an option to use different prompts for the initial image generation and for the upscaler.
The idea is to gradually reinterpret the data as the original image gets upscaled, making for better hand/finger structure and facial clarity even in full-body compositions, as well as extremely detailed skin.
This version is optimized for 8 GB of VRAM. If the image will not fully render at 8 GB, try bypassing a few of the last upscalers. If you have a lot of VRAM to work with, try adding another 0.5 upscaler as the first upscaler. Different models can require very different denoise strengths, so be sure to adjust those as well. There are preview images from each upscaling step, so you can see where the denoising needs adjustment. If you want to generate images faster, make sure to unplug the latent cables from the VAE decoders before they go into the image previewers.
For those with lower VRAM, try enabling tiled VAE and replacing the last VAE decoder with a tiled VAE decoder. This can also allow you to reach even higher resolutions, but in my experience it comes at a loss of color accuracy: the more tiled VAE decoders, the more color accuracy is lost. There's still some color accuracy loss even with regular VAE decoding, so whether you want to use as many as I do is up to your preferences and the checkpoints you work with. You can reduce this loss by using fewer upscalers.
The detail refinement step needs a very low denoise strength; try not to go above 0.2, and you might need to go as low as 0.03 or lower. This step adds a layer of noise that makes skin look less plastic, and adds clarity.
There's some logic behind why the scaling factors gradually decrease, which I won't go into too much. Basically, the lower the scale factor, the more the smaller details get worked on relative to the denoise strength.
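To make that concrete, here's a rough plain-Python sketch of how a decreasing schedule behaves. The factors and denoise values are made-up examples, not the workflow's exact settings:

# Hypothetical schedule: each step upscales by a smaller factor,
# so later, higher-resolution steps refine progressively finer detail.
width, height = 512, 768
schedule = [(2.0, 0.5), (1.5, 0.4), (1.25, 0.3)]  # (scale factor, denoise strength)
for factor, denoise in schedule:
    width, height = int(width * factor), int(height * factor)
    print(f"x{factor}: {width}x{height} at denoise {denoise}")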
If you want to switch the schedulers from karras back to normal, please be aware that you need to reduce the denoise strength drastically; different schedulers apply noise differently.
ComfyUI custom nodes needed to use this:
https://civitai.com/models/20793/was-node-suite-comfyui
https://civitai.com/models/33192/comfyui-impact-pack
https://civitai.com/models/32342/efficiency-nodes-for-comfyui
ESRGAN (HIGHLY RECOMMENDED! Others might give artifacts!):
https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-general-wdn-x4v3.pth
VAE: https://huggingface.co/stabilityai/sd-vae-ft-mse-original/tree/main
<Game Icon Research Institute - luxiaoyu> "AI-Game Icon Institute"
The Game Icon Research Institute is recruiting collaborators. If you are interested, you can contact us on Discord: @luxiaoyu (developer).
We have prepared 100,000 assets for this round of AI research on game interfaces. The large model for game UI interfaces is currently undergoing internal testing; it can generate UI interfaces from rough sketches, so you can use rough wireframes to realize your interface design style. After internal testing is completed, it will first be tested in the QQ group.
The Game Icon Research Institute's models have entered into a strategic partnership with C Station (Civitai); similar websites are prohibited from copying these models. They are for personal study and communication purposes only. The right of interpretation belongs to Civitai and the Icon Research Institute.
Exchange group (Discord): https://discord.gg/njBMYJ7mRF
Exchange group (QQ): 489141941
Version 2.0 is suitable for creating icons in a 2D style, while Version 3.0 is suitable for creating icons in a 3D style.
Note that newer versions are not necessarily better.
Warning: I just remembered something. The scripts are recursive, so you can just drop your images folder inside and they will process it. On the other hand, do not just drop them anywhere and run them: if you were to drop them at C:\ and run them, they would look for images EVERYWHERE. It won't damage anything, but it will create a lot of trash. So remember to run them in their own folder.
Edit (2023-06-02): you do know it is OK to complain? I used the files from the zip I uploaded and noticed that I actually fucked up again and missed part of the webp command: it was missing a -o declaring the output, and a bit to manage the output of dwebp. I have fixed it again, and I am sorry.
Edit: I seem to have fucked up and missed the WebP-to-PNG step in the to-PNG script. I have already added it, but please redownload. Sorry.
I posted a LoRA-making guide I made a while ago. I normally use a couple of PowerShell scripts to change the extension of the files and to square images for resizing or upscaling.
I will add three scripts. The first converts jpeg, jpg, bmp, avif, webp, and gif to png.
The second does that and also makes images square, adding white bars at the top and bottom or at the sides of the original image (a rough Python sketch of the padding logic follows the usage notes below).
The third simply extracts GIFs into a bunch of PNGs.
To use them, put the images into the script folder, right-click the .ps1 file, and click "Run with PowerShell".
The scripts don't modify the original files; they just create new ones. The format-swapped files should end with "from_jpg" or whatever extension they had before. The resized files should be named Resized plus a six-digit number.
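For illustration, here's a rough Python/Pillow equivalent of the squaring logic (the actual scripts are PowerShell; the file names here are just examples):

from PIL import Image  # pip install Pillow

def square_with_white_bars(src_path, out_path):
    # Pad the shorter dimension with white so the image becomes square,
    # keeping the original centered; the source file is left untouched.
    img = Image.open(src_path).convert("RGB")
    side = max(img.size)
    canvas = Image.new("RGB", (side, side), "white")
    canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
    canvas.save(out_path)

square_with_white_bars("photo.jpg", "Resized000001.png")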
Here's how to run it.
Manaka Iwami's voice, from Mahiru Shiina (Otonari no Tenshi-sama ni Itsu no Ma ni ka Dame Ningen ni Sareteita Ken) and Akane Kurokawa (Oshi no Ko), trained on an RTX 3080 for 20,001 epochs. Training takes around 1.1 days with PyTorch 2.0 and the latest Nvidia driver optimized for AI.
The dataset is mostly from Otonari no Tenshi-sama; Oshi no Ko is an ongoing anime, so that part of her voice is incomplete. Listen to her voice here!
More revisions coming soon for better inference results!
Update, v1.2, June 1: moved metadata to external files, allowing a consistent sha256 hash every time a file is converted. See the version notes for all changes. Tweaked the title from 'files' to 'embeddings' to reflect the tool's limitations.
I wanted an easy way to convert .pt (PyTorch/PickleTensors) and .bin files for Textual Inversions and VAEs to the Safetensors format. DiffusionDalmation on GitHub has a Jupyter/Colab notebook (MIT license) that handled .pt files, but not the .bin files I had, because of missing data in those files. Hugging Face has a function in the Safetensors repo (Apache license) that handles .bin, and probably .pt, but I liked the training metadata details from the notebook version.
WARNING: code within files will be executed when the models are loaded - any malicious code will be executed, too. Do not run this on your own machine with untrusted/unscanned files containing pickle imports.
I started with pieces of both scripts and rewrote them into a script that will try to convert both types, as individual files or a directory of files. The .safetensors file gets a new hash since it is a new file, but the two are functionally identical in my testing. I have only tested on PyTorch 2 with SD1.5 TIs and VAEs, though. It works on Windows with CPU or CUDA, and has theoretical support for the macOS Metal backend, falling back to CPU. Buy me a Mac and I'll test it there. ;P
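The core of the conversion is small. Here's a minimal sketch of the idea (assuming the common A1111-style .pt embedding layout and the diffusers-style .bin layout; the real script also handles metadata, verification, and edge cases):

import torch
from safetensors.torch import save_file

def convert_embedding(src_path, out_path):
    # torch.load unpickles the file - this is the step that can execute
    # arbitrary code, hence the warning above about untrusted files.
    data = torch.load(src_path, map_location="cpu")
    if isinstance(data, dict) and "string_to_param" in data:
        # A1111-style .pt embedding: tensors live under "string_to_param"
        items = data["string_to_param"].items()
    else:
        # diffusers-style .bin embedding: a flat {token: tensor} dict
        items = data.items()
    tensors = {k: v.detach().cpu().contiguous() for k, v in items if torch.is_tensor(v)}
    save_file(tensors, out_path)

convert_embedding("ng_deepnegative_v1_75t.pt", "ng_deepnegative_v1_75t.safetensors")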
Assuming that you're in a trusted environment converting models you trust, you can activate an existing venv and run it from there, or set up a new venv with requirements.txt - the convert.bat script should handle this for you on Windows.
Using convert.bat:
V:\sd-info\safetensors-converter\convert.bat O:\embeddings
Or reusing an existing venv from the automatic1111 web UI and running the script directly:
V:\stable-diffusion-webui\venv\Scripts\activate
python V:\sd-info\safetensors-converter\bin-pt_to_safetensors.py O:\embeddings
deactivate
By default, the displayed metadata about the original file will be stored in a "<modelname>.metadata.txt" file alongside the .safetensors file. You can add --json and/or --html to the command to save it in those formats instead (may be helpful for things like the sd-model-preview-xd extension for automatic1111). You can also pass --skip-meta if you don't want the metadata saved at all.
You can pass '.' as the <convert_path> value to convert anything in the current directory, or provide the full path to a file or directory. It now recurses through subdirectories and converts what it finds.
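For example, combining the two: convert everything under the current directory and save each file's metadata as JSON instead of plain text (reusing the same script invocation as above):

python V:\sd-info\safetensors-converter\bin-pt_to_safetensors.py . --json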
If you get an error on a specific file, it may just have the wrong extension; try renaming .bin to .pt or the other way around (my .pt VAEs needed to be named .bin - VAEs are barely tested with this, so YMMV).
Safetensors metadata will be added detailing the original format and shape of the tensor (vectors, dimensions).
The .pt conversion will display and save metadata about the training model, hash, and steps, when available.
The .bin conversion does not provide those extra details; the format seems to lack that data.
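If you want to inspect that embedded Safetensors metadata yourself, the safetensors package can read it without loading the tensors. A quick sketch (the exact keys depend on what was written during conversion):

from safetensors import safe_open

with safe_open("AS-MidAged.safetensors", framework="pt") as f:
    print(f.metadata())    # header metadata, e.g. original format/shape details
    print(list(f.keys()))  # names of the tensors stored in the file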
Post-conversion, the script will check file sizes and verify that the output tensors match the originals. It will throw an error if the file size has changed too much or if there's a mismatch.
The script should halt whenever there's an error, and will overwrite any existing .safetensors files with the same base name as the original file.
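For the curious, here's a hedged sketch of the kind of tensor check this implies, again assuming an A1111-style .pt layout (not the script's exact code):

import torch
from safetensors.torch import load_file

original = torch.load("AS-MidAged.pt", map_location="cpu")["string_to_param"]
converted = load_file("AS-MidAged.safetensors")
for key, tensor in converted.items():
    # Each converted tensor should be bit-identical to its source.
    assert torch.equal(tensor, original[key].detach().cpu()), f"mismatch in {key}"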
Before/after file size and hashes for a few example TIs:
.:
total 0
drwxr-xr-x 1 user 0 Jun 2 13:29 negatives/
drwxr-xr-x 1 user 0 Jun 2 13:29 nobodies/
drwxr-xr-x 1 user 0 Jun 2 13:29 sliders/
./negatives:
total 457
-rw-r--r-- 1 user 231339 Apr 30 19:59 ng_deepnegative_v1_75t.pt
-rw-r--r-- 1 user 230488 Jun 2 13:29 ng_deepnegative_v1_75t.safetensors
./nobodies:
total 9
-rw-r--r-- 1 user 3931 May 16 06:21 LulaCipher.bin
-rw-r--r-- 1 user 3152 Jun 2 13:29 LulaCipher.safetensors
./sliders:
total 105
-rw-r--r-- 1 user 50036 May 20 10:35 AS-MidAged.pt
-rw-r--r-- 1 user 49232 Jun 2 13:29 AS-MidAged.safetensors
$ for i in */*.*; do sha256sum ${i}; done
54e7e4826d53949a3d0dde40aea023b1e456a618c608a7630e3999fd38f93245 *negatives/ng_deepnegative_v1_75t.pt
4fff59d544381804f989fa1db606dce90e1a31070fb8b74ee7238508ddc88bbb *negatives/ng_deepnegative_v1_75t.safetensors
433c565251ac13398000595032c436eb361634e80e581497d116f224083eb468 *nobodies/LulaCipher.bin
79850379fbb29ece0c3c2fef0e5e9a2dee02bd65827f7a0c6743c848560fb6ad *nobodies/LulaCipher.safetensors
d9a9546a597ad34497d4a5a24624478df056b5a9426a1934efdbfd65177b120d *sliders/AS-MidAged.pt
cd5bfdc84fe2e3730162b360fbf242167a37ec2a7876a61b3ab94906de7e79e4 *sliders/AS-MidAged.safetensors