...
info from wiki https://en.wikipedia.org/wiki/Stable_Diffusion

| Version number | Release date | Parameters | Notes |
|---|---|---|---|
| 1.1, 1.2, 1.3, 1.4 | August 2022 | | All released by CompVis. There is no "version 1.0". 1.1 gave rise to 1.2, and 1.2 gave rise to both 1.3 and 1.4. |
| 1.5 | October 2022 | 983M | Initialized with the weights of 1.2, not 1.4. Released by RunwayML. |
| 2.0 | November 2022 | | Retrained from scratch on a filtered dataset. |
| 2.1 | December 2022 | | Initialized with the weights of 2.0. |
| XL 1.0 | July 2023 | 3.5B | The XL 1.0 base model has 3.5 billion parameters, making it around 3.5x larger than previous versions. |
| XL Turbo | November 2023 | | Distilled from XL 1.0 to run in fewer diffusion steps. |
| 3.0 | February 2024 (early preview) | 800M to 8B | A family of models. |
| 3.5 | October 2024 | 2.5B to 8B | A family of models with Large (8 billion parameters), Large Turbo (distilled from SD 3.5 Large), and Medium (2.5 billion parameters). |
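
As a quick sanity check of the "around 3.5x larger" note for XL 1.0, the ratio can be computed from the two parameter counts listed in the table. This is only a sketch: it assumes "previous versions" refers to the 983M figure given for 1.5, which the table does not state explicitly, and the constant names are illustrative.

```python
# Parameter counts as listed in the table above.
# Assumption: "previous versions" means the 983M figure shown for 1.5.
SD_1_5_PARAMS = 983e6  # 983M
SD_XL_PARAMS = 3.5e9   # 3.5B

ratio = SD_XL_PARAMS / SD_1_5_PARAMS
print(f"XL 1.0 is about {ratio:.2f}x the size of 1.5")
```
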
SD.Next Supported Models
from https://vladmandic.github.io/sdnext-docs/Models/

Sorted by model parameters

| Publisher | Model | Version | Size | Diffusion Architecture | Model Params | Text Encoder(s) | TE Params | Auto Encoder | Other |
|---|---|---|---|---|---|---|---|---|---|
| HiDream-AI | HiDream | I1 Fast/Dev/Full | 42.71 GB + 15.69 | MMDiT | 17.10B | CLiP ViT-L + ViT+G + T5-XXL + Llama-3.1-8B | 0.12B + 0.69B + 2.95B + 4.54B | 16ch VAE | |
| Black Forest Labs | Flux | 1 Dev/Schnell | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | |
| StabilityAI | Stable Diffusion | 3.5 Large | 26.98GB | MMDiT | 8.05B | CLiP ViT-L + ViT+G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | |
| FAL | AuraFlow | 0.3 | 31.90GB | MMDiT | 6.8B | UMT5 | 12.1B | VAE | |
| Thudm | CogView | 4 | 30.39GB | DiT | 6.37B | GLM-4 | 9.40B | 16ch VAE | |
| StabilityAI | Stable Cascade | Medium | 11.82GB | Multi-stage UNet | 1.56B + 3.6B | CLiP ViT-G | 0.69B | 42x VQE | |
| NVLabs | Sana | 1.5 4.8B | 15.58GB | MMDiT | 4.72B | Gemma2 | 2.61B | DC-AE | |
| Ostris | Flex | 1 Alpha | 25.65GB | MMDiT | 4.0B | CLiP ViT-L + T5-XXL | 0.12B + 2.95B | 16ch VAE | |
| VectorSpaceLab | OmniGen | v1 | 15.47GB | Transformer | 3.76B | None | 0 | VAE | Phi-3 |
| Kandinsky | Kandinsky | 3 | 27.72GB | UNet | 3.05B | T5-XXXL | 8.72B | VQ | |
| Thudm | CogView | 3 Plus | 24.96GB | DiT | 2.85B | T5-XXL | 4.76B | VAE | |
| AlphaVLLM | Lumina | 2 | 20.75GB | DiT | 2.61B | Gemma-2 | 2.61B | 16ch VAE | |
| Kwai | Kolors | N/A | 17.40GB | UNet | 2.58B | ChatGLM | 6.24B | VAE | |
| StabilityAI | Stable Diffusion | XL | 6.94GB | UNet | 2.56B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| PlaygroundAI | Playground | 2.x | 13.35GB | UNet | 2.56B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| StabilityAI | Stable Diffusion | 3.5 Medium | 15.89GB | MMDiT | 2.25B | CLiP ViT-L + ViT+G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | |
| Warp AI | Wuerstchen | N/A | 12.16GB | Multi-stage UNet | 1.0B + 1.05B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | 42x VQE | |
| StabilityAI | Stable Diffusion | 3.0 Medium | 15.14GB | MMDiT | 2.0B | CLiP ViT-L + ViT+G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | |
| StabilityAI | Stable Cascade | Lite | 4.97GB | Multi-stage UNet | 0.7B + 1.0B | CLiP ViT-G | 0.69B | 42x VQE | |
| AlphaVLLM | Lumina | Next SFT | 8.67GB | DiT | 1.7B | Gemma | 2.5B | VAE | |
| NVLabs | Sana | 1.5 1.6B | 9.49GB | MMDiT | 1.60B | Gemma2 | 2.61B | DC-AE | |
| NVLabs | Sana | 1.0 1600M | 12.63GB | MMDiT | 1.60B | Gemma2 | 2.61B | DC-AE | |
| DeepFloyd | IF | L | 15.48GB | Multi-stage UNet | 0.61B + 0.93B | T5-XXL | 4.76B | Pixel | |
| Tencent | HunyuanDiT | 1.2 | 14.09GB | DiT | 1.5B | BERT + T5-XL | 3.52B + 1.67B | VAE | |
| Segmind | SSD-1B | N/A | 8.72GB | UNet | 1.33B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| Kandinsky | Kandinsky | 2.2 | 5.15GB | UNet | 1.25B | CLiP ViT-G | 0.69B | VQ | |
| MeissonFlow | Meissonic | N/A | 3.64GB | DiT | 1.18B | CLiP ViT-H | 0.35B | VQ | |
| Thu-ML | UniDiffuser | v1 | 5.37GB | U-ViT | 0.95B | CLiP ViT-L + CLiP ViT-B | 0.12B + 0.16B | VAE | |
| StabilityAI | Stable Diffusion | 2.1 | 2.58GB | UNet | 0.86B | CLiP ViT-H | 0.34B | VAE | |
| StabilityAI | Stable Diffusion | 1.5 | 2.28GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | |
| PlaygroundAI | Playground | 1 | 4.95GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | |
| Salesforce | BLIP-Diffusion | N/A | 7.23GB | UNet | 0.86B | CLiP ViT-L + BLiP-2 | 0.12B + 0.49B | VAE | |
| DeepFloyd | IF | M | 12.79GB | Multi-stage UNet | 0.37B + 0.46B | T5-XXL | 4.76B | Pixel | |
| Koala | Koala | 700M | 6.58GB | UNet | 0.78B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| Segmind | Vega | N/A | 6.43GB | UNet | 0.75B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| PixArt | Alpha | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | |
| PixArt | Sigma | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | |
| Open-MUSE | aMUSEd | 256 | 3.41GB | ViT | 0.60B | CLiP ViT-L | 0.12B | VQ | |
| NVLabs | Sana | 1.0 600M | 7.51GB | MMDiT | 0.59B | Gemma2 | 2.61B | DC-AE | |
| Segmind | Tiny | N/A | 1.03GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | |
| IDKiro | SDXS | N/A | 2.05GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | |
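
A "sorted by model parameters" view like the one above can be derived from the table rows with a few lines of Python. The sketch below is illustrative only: the helper names (`total_params_billion`, `parse_rows`, `sort_by_params`) are made up for this note, only four sample rows are included, and summing the stages of multi-stage models (for example `1.56B + 3.6B`) is an assumption inferred from where those rows sit in the ordering, not something the SD.Next docs state.

```python
import re

# A few rows copied from the table; the full table would be handled the same way.
TABLE = """\
| StabilityAI | Stable Diffusion | 1.5 | 2.28GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE |
| StabilityAI | Stable Cascade | Medium | 11.82GB | Multi-stage UNet | 1.56B + 3.6B | CLiP ViT-G | 0.69B | 42x VQE |
| Black Forest Labs | Flux | 1 Dev/Schnell | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE |
| NVLabs | Sana | 1.5 4.8B | 15.58GB | MMDiT | 4.72B | Gemma2 | 2.61B | DC-AE |
"""

PARAMS_COLUMN = 5  # zero-based index of the "Model Params" column


def total_params_billion(cell: str) -> float:
    """Sum every '<number>B' term in a cell such as '1.56B + 3.6B'."""
    return sum(float(n) for n in re.findall(r"(\d+(?:\.\d+)?)B", cell))


def parse_rows(table: str) -> list[list[str]]:
    """Split each pipe-delimited line into a list of stripped cells."""
    rows = []
    for line in table.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    return rows


def sort_by_params(rows: list[list[str]]) -> list[list[str]]:
    """Order rows from largest to smallest total model parameter count."""
    return sorted(
        rows,
        key=lambda row: total_params_billion(row[PARAMS_COLUMN]),
        reverse=True,
    )


rows = sort_by_params(parse_rows(TABLE))
for row in rows:
    print(f"{row[1]} {row[2]}: {total_params_billion(row[PARAMS_COLUMN]):.2f}B")
```

Python's `sorted` is stable, so rows with equal totals keep their original relative order.
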
from https://vladmandic.github.io/sdnext-docs/Models/

| Publisher | Model | Version | Size | Diffusion Architecture | Model Params | Text Encoder(s) | TE Params | Auto Encoder | Other |
|---|---|---|---|---|---|---|---|---|---|
| StabilityAI | Stable Diffusion | 3.5 Large | 26.98GB | MMDiT | 8.05B | CLiP ViT-L + ViT+G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | |
| StabilityAI | Stable Diffusion | 3.5 Medium | 15.89GB | MMDiT | 2.25B | CLiP ViT-L + ViT+G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | |
| StabilityAI | Stable Diffusion | 3.0 Medium | 15.14GB | MMDiT | 2.0B | CLiP ViT-L + ViT+G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | |
| StabilityAI | Stable Cascade | Medium | 11.82GB | Multi-stage UNet | 1.56B + 3.6B | CLiP ViT-G | 0.69B | 42x VQE | |
| StabilityAI | Stable Cascade | Lite | 4.97GB | Multi-stage UNet | 0.7B + 1.0B | CLiP ViT-G | 0.69B | 42x VQE | |
| StabilityAI | Stable Diffusion | XL | 6.94GB | UNet | 2.56B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| StabilityAI | Stable Diffusion | 2.1 | 2.58GB | UNet | 0.86B | CLiP ViT-H | 0.34B | VAE | |
| StabilityAI | Stable Diffusion | 1.5 | 2.28GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | |
| HiDream-AI | HiDream | I1 Fast/Dev/Full | 42.71 GB + 15.69 | MMDiT | 17.10B | CLiP ViT-L + ViT+G + T5-XXL + Llama-3.1-8B | 0.12B + 0.69B + 2.95B + 4.54B | 16ch VAE | |
| Black Forest Labs | Flux | 1 Dev/Schnell | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | |
| Ostris | Flex | 1 Alpha | 25.65GB | MMDiT | 4.0B | CLiP ViT-L + T5-XXL | 0.12B + 2.95B | 16ch VAE | |
| NVLabs | Sana | 1.5 4.8B | 15.58GB | MMDiT | 4.72B | Gemma2 | 2.61B | DC-AE | |
| NVLabs | Sana | 1.5 1.6B | 9.49GB | MMDiT | 1.60B | Gemma2 | 2.61B | DC-AE | |
| NVLabs | Sana | 1.0 1600M | 12.63GB | MMDiT | 1.60B | Gemma2 | 2.61B | DC-AE | |
| NVLabs | Sana | 1.0 600M | 7.51GB | MMDiT | 0.59B | Gemma2 | 2.61B | DC-AE | |
| FAL | AuraFlow | 0.3 | 31.90GB | MMDiT | 6.8B | UMT5 | 12.1B | VAE | |
| AlphaVLLM | Lumina | Next SFT | 8.67GB | DiT | 1.7B | Gemma | 2.5B | VAE | |
| AlphaVLLM | Lumina | 2 | 20.75GB | DiT | 2.61B | Gemma-2 | 2.61B | 16ch VAE | |
| PixArt | Alpha | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | |
| PixArt | Sigma | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | |
| Segmind | SSD-1B | N/A | 8.72GB | UNet | 1.33B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| Segmind | Vega | N/A | 6.43GB | UNet | 0.75B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| Segmind | Tiny | N/A | 1.03GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | |
| Kwai | Kolors | N/A | 17.40GB | UNet | 2.58B | ChatGLM | 6.24B | VAE | |
| PlaygroundAI | Playground | 1.0 | 4.95GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | |
| PlaygroundAI | Playground | 2.x | 13.35GB | UNet | 2.56B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| Tencent | HunyuanDiT | 1.2 | 14.09GB | DiT | 1.5B | BERT + T5-XL | 3.52B + 1.67B | VAE | |
| Warp AI | Wuerstchen | N/A | 12.16GB | Multi-stage UNet | 1.0B + 1.05B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | 42x VQE | |
| Kandinsky | Kandinsky | 2.2 | 5.15GB | UNet | 1.25B | CLiP ViT-G | 0.69B | VQ | |
| Kandinsky | Kandinsky | 3.0 | 27.72GB | UNet | 3.05B | T5-XXXL | 8.72B | VQ | |
| Thudm | CogView | 3 Plus | 24.96GB | DiT | 2.85B | T5-XXL | 4.76B | VAE | |
| Thudm | CogView | 4 | 30.39GB | DiT | 6.37B | GLM-4 | 9.40B | VAE | |
| IDKiro | SDXS | N/A | 2.05GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | |
| Open-MUSE | aMUSEd | 256 | 3.41GB | ViT | 0.60B | CLiP ViT-L | 0.12B | VQ | |
| Koala | Koala | 700M | 6.58GB | UNet | 0.78B | CLiP ViT-L + ViT+G | 0.12B + 0.69B | VAE | |
| Thu-ML | UniDiffuser | v1 | 5.37GB | U-ViT | 0.95B | CLiP ViT-L + CLiP ViT-B | 0.12B + 0.16B | VAE | |
| Salesforce | BLIP-Diffusion | N/A | 7.23GB | UNet | 0.86B | CLiP ViT-L + BLiP-2 | 0.12B + 0.49B | VAE | |
| DeepFloyd | IF | M | 12.79GB | Multi-stage UNet | 0.37B + 0.46B | T5-XXL | 4.76B | Pixel | |
| DeepFloyd | IF | L | 15.48GB | Multi-stage UNet | 0.61B + 0.93B | T5-XXL | 4.76B | Pixel | |
| MeissonFlow | Meissonic | N/A | 3.64GB | DiT | 1.18B | CLiP ViT-H | 0.35B | VQ | |
| VectorSpaceLab | OmniGen | v1 | 15.47GB | Transformer | 3.76B | None | 0 | VAE | Phi-3 |

Size is the size of the model on disk.
...