Info
https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
Test 1 - Different seed variations
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 50| Size: 1024x1024| Seed: 1972235878| CFG scale: 6| App: SD.Next| Version: 7ccb9d3| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc
Time: 4m 15.88s | total 494.75 pipeline 249.02 preview 238.10 decode 5.90 move 0.69 prompt 0.61 gc 0.57 post 0.27 | GPU 9470 MB 7% | RAM 2.84 GB 2%
| CFG 6, 50 STEPS | 2899868740 | 2561095516 | 3977700936 | 1099727609 | 1972235878 |
|---|---|---|---|---|---|
| bookshop girl | | | | | |
| 1024 | | | | | |
| face and hand 768px | | | | | |
| legs and shoes 768px | | | | | |
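The `Parameters:` lines in this log are `Key: value` pairs separated by `|`. A minimal sketch of turning one into a dictionary, assuming that layout holds for every line; `parse_parameters` is a hypothetical helper, not part of SD.Next:

```python
def parse_parameters(line: str) -> dict[str, str]:
    """Parse 'Key: value| Key: value' text into a dict of strings."""
    text = line.removeprefix("Parameters:").strip()
    params = {}
    for field in text.split("|"):
        key, sep, value = field.partition(":")
        if sep:  # skip fragments that carry no 'key: value' pair
            params[key.strip()] = value.strip()
    return params


line = (
    "Parameters: Steps: 50| Size: 1024x1024| Seed: 1972235878| CFG scale: 6| "
    "App: SD.Next| Version: 7ccb9d3| Pipeline: StableDiffusionXLPipeline| "
    "Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc"
)
params = parse_parameters(line)
width, height = (int(v) for v in params["Size"].split("x"))
print(params["Steps"], params["CFG scale"], width, height)
```

All values stay strings, so convert `Steps`, `Seed` and `CFG scale` explicitly where numbers are needed.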
Test 1 - Bookshop
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 32| Size: 768x768| Seed: 1972235878| CFG scale: 7| App: SD.Next| Version: 0b8001c| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc
Time: 1m 26.80s | total 165.44 pipeline 83.94 preview 78.63 decode 2.59 gc 0.46 | GPU 8872 MB 7% | RAM 2.84 GB 2%
| CFG / steps | 8 | 16 | 20 | 32 | 50 |
|---|---|---|---|---|---|
| CFG0 | | | | | |
| CFG2 | | | | | |
| CFG3 | | | | | |
| CFG4 | | | | | |
| CFG5 | | | | | |
| CFG6 | | | | | |
| CFG7 | | | | | |
| CFG8 | | | | | |
| CFG9 | | | | | |
| CFG12 | | | | | |
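The grid above is a sweep over step counts (columns) and CFG scales (rows). A minimal sketch of enumerating its cells as generation jobs follows. It assumes every cell reused the seed and 768x768 size from the Parameters line, which the log does not state explicitly, and `build_grid` is a hypothetical helper:

```python
from itertools import product

STEPS = [8, 16, 20, 32, 50]                 # table columns
CFG_SCALES = [0, 2, 3, 4, 5, 6, 7, 8, 9, 12]  # table rows
SEED = 1972235878                            # assumption: same seed for every cell


def build_grid(steps_values, cfg_values, seed, size=768):
    """Return one job dict per table cell, row by row (CFG outer, steps inner)."""
    return [
        {"steps": steps, "cfg": cfg, "seed": seed, "width": size, "height": size}
        for cfg, steps in product(cfg_values, steps_values)
    ]


jobs = build_grid(STEPS, CFG_SCALES, SEED)
print(len(jobs), jobs[0], jobs[-1])
```

Holding the seed fixed across the grid keeps the starting noise identical, so differences between cells come from steps and CFG alone.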
Test 2 - Face and hand
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 20| Size: 768x768| Seed: 1099727609| CFG scale: 6| App: SD.Next| Version: 0b8001c| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc
Time: 55.70s | total 60.21 pipeline 52.82 preview 4.50 decode 2.63 gc 0.46 | GPU 8878 MB 7% | RAM 2.84 GB 2%
| CFG / steps | 8 | 16 | 20 | 32 | 50 | 100 |
|---|---|---|---|---|---|---|
| CFG0 CFG1 | | | | | | |
| CFG2 | | | | | | |
| CFG3 | | | | | | |
| CFG4 | | | | | | |
| CFG5 | | | | | | |
| CFG6 | | | | | | |
| CFG7 | | | | | | |
| CFG8 | | | | | | |
| CFG12 | | | | | | |
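For readers who want to try one cell outside SD.Next, here is a rough sketch of how the logged settings could map onto a plain diffusers `StableDiffusionXLPipeline` call. It is an assumption that this approximates what SD.Next does internally, and the same seed is not guaranteed to give the same image. `to_pipeline_kwargs` and `generate` are hypothetical helpers, and the prompt string is shortened for illustration:

```python
def to_pipeline_kwargs(prompt, steps, cfg, size):
    """Map the logged settings to StableDiffusionXLPipeline call arguments."""
    width, height = (int(v) for v in size.split("x"))
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": float(cfg),
        "width": width,
        "height": height,
    }


def generate(kwargs, seed, device="xpu"):
    """Run one txt2img generation; heavy imports stay local to this function."""
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
    ).to(device)
    # A CPU generator keeps the initial noise independent of the GPU backend
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipe(**kwargs, generator=generator).images[0]


# Settings from the Test 2 Parameters line; prompt shortened (placeholder)
kwargs = to_pipeline_kwargs("close-up photograph of a woman's face and hand", 20, 6, "768x768")
# image = generate(kwargs, seed=1099727609)  # downloads the model, needs an XPU/GPU
```

The `device="xpu"` default matches the Intel Arc setup in this log; on other hardware pass `"cuda"` or `"cpu"`.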
Test 3 - Legs
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
Parameters: Steps: 16| Size: 768x768| Seed: 3977700936| CFG scale: 4| App: SD.Next| Version: 0b8001c| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc
Time: 46.40s | total 87.23 pipeline 42.72 preview 39.91 decode 2.54 move 0.89 prompt 0.88 gc 0.50 | GPU 8878 MB 7% | RAM 2.86 GB 2%
| CFG / steps | 16 | 20 | 32 | 50 |
|---|---|---|---|---|
| CFG4 | | | | |
| CFG6 | | | | |
| CFG8 | | | | |
| CFG10 | | | | |
| CFG12 | | | | |
| CFG14 | | | | |
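The four `Time:` lines above can be compared directly. The sketch below parses the stage breakdown and derives time per step and the share taken by `preview`, assuming the stage figures are seconds. `parse_breakdown` and the run labels are hypothetical names; the lines themselves are copied from the tests:

```python
import re

# label -> (steps, Time line copied from the log)
RUNS = {
    "seeds 1024px": (50, "Time: 4m 15.88s | total 494.75 pipeline 249.02 preview 238.10 decode 5.90 move 0.69 prompt 0.61 gc 0.57 post 0.27 | GPU 9470 MB 7% | RAM 2.84 GB 2%"),
    "bookshop 768px": (32, "Time: 1m 26.80s | total 165.44 pipeline 83.94 preview 78.63 decode 2.59 gc 0.46 | GPU 8872 MB 7% | RAM 2.84 GB 2%"),
    "face 768px": (20, "Time: 55.70s | total 60.21 pipeline 52.82 preview 4.50 decode 2.63 gc 0.46 | GPU 8878 MB 7% | RAM 2.84 GB 2%"),
    "legs 768px": (16, "Time: 46.40s | total 87.23 pipeline 42.72 preview 39.91 decode 2.54 move 0.89 prompt 0.88 gc 0.50 | GPU 8878 MB 7% | RAM 2.86 GB 2%"),
}


def parse_breakdown(line):
    """Extract 'stage seconds' pairs from the second '|'-separated field."""
    stages = line.split("|")[1]
    return {name: float(sec) for name, sec in re.findall(r"([a-z]+) ([\d.]+)", stages)}


summary = {}
for label, (steps, line) in RUNS.items():
    t = parse_breakdown(line)
    summary[label] = (t["pipeline"] / steps, t["preview"] / t["total"])
    print(f"{label}: {summary[label][0]:.2f} s/step, preview {summary[label][1]:.0%} of total")
```

On these numbers the pipeline stage costs roughly 2.6 to 2.7 s per step at 768x768 and about 5 s per step at 1024x1024, and `preview` is close to the pipeline time in three of the four runs.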
System info
app: sdnext.git | updated: 2025-07-25 | hash: 0b8001c0 | url: https://github.com/vladmandic/sdnext.git/tree/dev | arch: x86_64 | cpu: x86_64 | system: Linux | release: 6.14.0-24-generic | python: 3.12.3 | Torch: 2.7.1+xpu | device: Intel(R) Arc(TM) Graphics (1) | ipex: | ram: free:122.5 used:2.83 total:125.33 | xformers: | diffusers: 0.35.0.dev0 | transformers: 4.53.2 | active: xpu | dtype: torch.bfloat16 | vae: torch.bfloat16 | unet: torch.bfloat16 | base: sd_xl_base_1.0 [31e35c80fc] | refiner: none | vae: none | te: none | unet: none
Config
{
"samples_filename_pattern": "[seq]-[date]-[model_name]-[height]x[width]-STEP[steps]-CFG[cfg]-Seed[seed]",
"diffusers_version": "1c50a5f7e0392281336e21bc3f74ba48f8819207",
"sd_model_checkpoint": "sd_xl_base_1.0 [31e35c80fc]",
"sd_checkpoint_hash": "31e35c80fc4829d14f90153f4c74cd59c90b779f6afe05a74cd6120b893f7e5b",
"diffusers_to_gpu": true,
"diffusers_offload_mode": "none",
"civitai_token": "<redacted>"
}
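The `samples_filename_pattern` value is a template with bracketed tokens. A minimal sketch of how such a template could be expanded is shown below. This is not SD.Next's own implementation; `expand_pattern` is a hypothetical helper, and the `seq`, `date` and `cfg` formats are guesses:

```python
import re

PATTERN = "[seq]-[date]-[model_name]-[height]x[width]-STEP[steps]-CFG[cfg]-Seed[seed]"


def expand_pattern(pattern: str, values: dict) -> str:
    """Replace each [token] with its value; unknown tokens are left untouched."""
    return re.sub(
        r"\[(\w+)\]",
        lambda m: str(values.get(m.group(1), m.group(0))),
        pattern,
    )


name = expand_pattern(PATTERN, {
    "seq": "00001",            # hypothetical sequence number format
    "date": "2025-07-25",      # hypothetical date format
    "model_name": "sd_xl_base_1.0",
    "height": 768,
    "width": 768,
    "steps": 32,
    "cfg": 7,                  # the real tool may format this differently
    "seed": 1972235878,
})
print(name)
```

Encoding steps, CFG and seed in the filename makes each grid cell identifiable without opening image metadata.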
Model Info
Model: sd_xl_base_1.0 | Type: sdxl | Class: StableDiffusionXLPipeline | Size: 6 938 078 334 bytes | Modified: 2025-07-15 13:47:33
| Module | Class | Device | DType | Params | Modules | Config |
|---|---|---|---|---|---|---|
vae | AutoencoderKL | xpu:0 | torch.bfloat16 | 83653863 | 243 | FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 4, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.13025, 'shift_factor': None, 'latents_mean': None, 'latents_std': None, 'force_upcast': False, 'use_quant_conv': True, 'use_post_quant_conv': True, 'mid_block_add_attention': True, '_use_default_values': ['mid_block_add_attention', 'latents_std', 'use_quant_conv', 'use_post_quant_conv', 'latents_mean', 'shift_factor'], '_class_name': 'AutoencoderKL', '_diffusers_version': '0.20.0.dev0', '_name_or_path': '../sdxl-vae/'}) |
text_encoder | CLIPTextModel | xpu:0 | torch.bfloat16 | 123060480 | 152 | CLIPTextConfig { "architectures": [ "CLIPTextModel" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "quick_gelu", "hidden_size": 768, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 77, "model_type": "clip_text_model", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "projection_dim": 768, "torch_dtype": "float16", "transformers_version": "4.53.2", "vocab_size": 49408 } |
text_encoder_2 | CLIPTextModelWithProjection | xpu:0 | torch.bfloat16 | 694659840 | 393 | CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "gelu", "hidden_size": 1280, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 5120, "layer_norm_eps": 1e-05, "max_position_embeddings": 77, "model_type": "clip_text_model", "num_attention_heads": 20, "num_hidden_layers": 32, "pad_token_id": 1, "projection_dim": 1280, "torch_dtype": "float16", "transformers_version": "4.53.2", "vocab_size": 49408 } |
tokenizer | CLIPTokenizer | None | None | 0 | 0 | None |
tokenizer_2 | CLIPTokenizer | None | None | 0 | 0 | None |
unet | UNet2DConditionModel | xpu:0 | torch.bfloat16 | 2567463684 | 1930 | FrozenDict({'sample_size': 128, 'in_channels': 4, 'out_channels': 4, 'center_input_sample': False, |
scheduler | EulerDiscreteScheduler | None | None | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'trained_betas': None, 'prediction_type': 'epsilon', 'interpolation_type': 'linear', 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'sigma_min': None, 'sigma_max': None, 'timestep_spacing': 'leading', 'timestep_type': 'discrete', 'steps_offset': 1, 'rescale_betas_zero_snr': False, 'final_sigmas_type': 'zero', '_use_default_values': ['use_exponential_sigmas', 'timestep_type', 'sigma_min', 'final_sigmas_type', 'use_beta_sigmas', 'sigma_max', 'rescale_betas_zero_snr'], '_class_name': 'EulerDiscreteScheduler', '_diffusers_version': '0.19.0.dev0', 'clip_sample': False, 'sample_max_value': 1.0, 'set_alpha_to_one': False, 'skip_prk_steps': True}) |
image_encoder | NoneType | None | None | 0 | 0 | None |
feature_extractor | NoneType | None | None | 0 | 0 | None |
force_zeros_for_empty_prompt | bool | None | None | 0 | 0 | None |
_class_name | str | None | None | 0 | 0 | None |
_diffusers_version | str | None | None | 0 | 0 | None |
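The Params column can be cross-checked against the reported file size. The sketch below sums the four modules that carry weights and assumes 2 bytes per parameter on disk (the text encoder configs list `torch_dtype` as `float16`); the file itself was not inspected, so the interpretation of the remainder is an assumption:

```python
# Parameter counts from the Params column of the module table
PARAMS = {
    "vae": 83_653_863,
    "text_encoder": 123_060_480,
    "text_encoder_2": 694_659_840,
    "unet": 2_567_463_684,
}
FILE_SIZE_BYTES = 6_938_078_334  # "Size" from the Model Info line
BYTES_PER_PARAM = 2              # assumption: 16-bit weights on disk

total_params = sum(PARAMS.values())
estimated_bytes = total_params * BYTES_PER_PARAM
remainder = FILE_SIZE_BYTES - estimated_bytes

print(f"parameters: {total_params:,}")
print(f"estimated weight bytes: {estimated_bytes:,}")
print(f"remainder (header, metadata, other): {remainder:,}")
```

The estimate lands within about 0.4 MB of the reported size, which would be consistent with a 16-bit checkpoint plus a small header and metadata block.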
{
modelspec.sai_model_spec: "1.0.0",
modelspec.architecture: "stable-diffusion-xl-v1-base",
modelspec.implementation: "https://github.com/Stability-AI/generative-models",
modelspec.title: "Stable Diffusion XL 1.0 Base",
modelspec.author: "StabilityAI",
modelspec.description: "SDXL 1.0 Base Model, compositional expert. SDXL, the most advanced development in the Stable Diffusion text-to-image suite of models. SDXL produces massively improved image and composition detail over its predecessors. The ability to generate hyper-realistic creations for films, television, music, and instructional videos, as well as offering advancements for design and industrial use, places SDXL at the forefront of real world applications for AI imagery.",
modelspec.date: "2023-07-26",
modelspec.resolution: "1024x1024",
modelspec.prediction_type: "epsilon",
modelspec.license: "CreativeML Open RAIL++-M License",
modelspec.thumbnail: "data",
modelspec.hash_sha256: "0xd7a9105a900fd52748f20725fe52fe52b507fd36bee4fc107b1550a26e6ee1d7"
}
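The scheduler row in the module table lists a `scaled_linear` beta schedule (`beta_start` 0.00085, `beta_end` 0.012, 1000 training timesteps) with `leading` timestep spacing and `steps_offset` 1. As a rough illustration of what those settings imply, the sketch below recomputes the noise levels and the sampled timesteps in plain Python. It follows the commonly published formulas rather than code read from SD.Next, and both function names are hypothetical:

```python
import math


def scaled_linear_sigmas(beta_start=0.00085, beta_end=0.012, num_train_timesteps=1000):
    """sigma_t = sqrt((1 - abar_t) / abar_t) for a scaled-linear beta schedule."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    sigmas, alpha_bar = [], 1.0
    for i in range(num_train_timesteps):
        # betas are linear in sqrt-space, then squared
        beta = (lo + (hi - lo) * i / (num_train_timesteps - 1)) ** 2
        alpha_bar *= 1.0 - beta
        sigmas.append(math.sqrt((1.0 - alpha_bar) / alpha_bar))
    return sigmas


def leading_timesteps(num_inference_steps, num_train_timesteps=1000, steps_offset=1):
    """'leading' spacing: multiples of T // n, shifted by the offset, highest first."""
    ratio = num_train_timesteps // num_inference_steps
    return [i * ratio + steps_offset for i in reversed(range(num_inference_steps))]


sigmas = scaled_linear_sigmas()
timesteps = leading_timesteps(50)
print(f"sigma range: {sigmas[0]:.4f} .. {sigmas[-1]:.2f}")
print("first and last timesteps for 50 steps:", timesteps[0], timesteps[-1])
```

With `leading` spacing the sampler never visits the final training timestep, so fewer steps means both coarser jumps and a slightly lower starting noise level.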