Info

https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0


Test 1 - Different seed variations

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

Parameters: Steps: 50| Size: 1024x1024| Seed: 1972235878| CFG scale: 6| App: SD.Next| Version: 7ccb9d3| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc

Time: 4m 15.88s | total 494.75 pipeline 249.02 preview 238.10 decode 5.90 move 0.69 prompt 0.61 gc 0.57 post 0.27 | GPU 9470 MB 7% | RAM 2.84 GB 2%

CFG 6, 50 steps | Seeds: 2899868740, 2561095516, 3977700936, 1099727609, 1972235878

Preview images: bookshop girl (768px and 1024px), face and hand (768px), legs and shoes (768px)
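
The runs above were generated through SD.Next. As a rough reproduction sketch with plain diffusers (assuming the Hugging Face model id linked at the top of this page and an XPU device; sampler and generator handling differ between tools, so outputs will not be pixel-identical):

import torch
from diffusers import StableDiffusionXLPipeline

# Device is an assumption: "xpu" for the Intel Arc setup below, "cuda" on NVIDIA hardware.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.bfloat16,   # matches the bfloat16 dtype reported in the system info
)
pipe.to("xpu")

# Seed, steps, CFG and size taken from the Test 1 parameters above.
generator = torch.Generator(device="cpu").manual_seed(1972235878)
image = pipe(
    prompt="photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling",
    num_inference_steps=50,
    guidance_scale=6,
    width=1024,
    height=1024,
    generator=generator,
).images[0]
image.save("bookshop-1024x1024-STEP50-CFG6-Seed1972235878.png")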



Test 1 - Bookshop

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

Parameters: Steps: 32| Size: 768x768| Seed: 1972235878| CFG scale: 7| App: SD.Next| Version: 0b8001c| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc

Time: 1m 26.80s | total 165.44 pipeline 83.94 preview 78.63 decode 2.59 gc 0.46 | GPU 8872 MB 7% | RAM 2.84 GB 2%



Image grid: steps 8, 16, 20, 32, 50 | CFG 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12
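
The CFG grid above can be scripted as a simple sweep over guidance_scale. A minimal sketch with plain diffusers, assuming the same model, prompt and seed as this test (the output filename only imitates the samples_filename_pattern from the Config section below):

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
).to("xpu")  # device is an assumption; use "cuda" on NVIDIA hardware

prompt = "photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling"
for cfg in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12]:
    # fixed seed and step count, so only guidance_scale changes between grid cells
    generator = torch.Generator(device="cpu").manual_seed(1972235878)
    image = pipe(
        prompt,
        num_inference_steps=32,
        guidance_scale=cfg,
        width=768,
        height=768,
        generator=generator,
    ).images[0]
    image.save(f"bookshop-768x768-STEP32-CFG{cfg}-Seed1972235878.png")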







Test 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Parameters: Steps: 20| Size: 768x768| Seed: 1099727609| CFG scale: 6| App: SD.Next| Version: 0b8001c| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc


Time: 55.70s | total 60.21 pipeline 52.82 preview 4.50 decode 2.63 gc 0.46 | GPU 8878 MB 7% | RAM 2.84 GB 2%



Image grid: steps 8, 16, 20, 32, 50, 100 | CFG 0, 1, 2, 3, 4, 5, 6, 7, 8, 12


Test 3 - Legs

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Parameters: Steps: 16| Size: 768x768| Seed: 3977700936| CFG scale: 4| App: SD.Next| Version: 0b8001c| Pipeline: StableDiffusionXLPipeline| Operations: txt2img| Model: sd_xl_base_1.0| Model hash: 31e35c80fc


Time: 46.40s | total 87.23 pipeline 42.72 preview 39.91 decode 2.54 move 0.89 prompt 0.88 gc 0.50 | GPU 8878 MB 7% | RAM 2.86 GB 2%



Image grid: steps 16, 20, 32, 50 | CFG 4, 6, 8, 10, 12, 14


System info


app: sdnext.git updated: 2025-07-25 hash: 0b8001c0 url: https://github.com/vladmandic/sdnext.git/tree/dev
arch: x86_64 cpu: x86_64 system: Linux release: 6.14.0-24-generic 
python: 3.12.3 Torch: 2.7.1+xpu
device: Intel(R) Arc(TM) Graphics (1) ipex: 
ram: free:122.5 used:2.83 total:125.33
xformers:  diffusers: 0.35.0.dev0 transformers: 4.53.2
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: sd_xl_base_1.0 [31e35c80fc] refiner: none vae: none te: none unet: none
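
A quick way to verify the XPU backend reported above from a Python shell (a sketch; device index 0 is an assumption):

import torch

print(torch.__version__)                 # expected: 2.7.1+xpu
if torch.xpu.is_available():
    print(torch.xpu.device_count())      # number of XPU devices
    print(torch.xpu.get_device_name(0))  # expected: Intel(R) Arc(TM) Graphics
    x = torch.randn(2, 2, device="xpu", dtype=torch.bfloat16)
    print(x.device, x.dtype)             # xpu:0 torch.bfloat16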


Config

{
  "samples_filename_pattern": "[seq]-[date]-[model_name]-[height]x[width]-STEP[steps]-CFG[cfg]-Seed[seed]",
  "diffusers_version": "1c50a5f7e0392281336e21bc3f74ba48f8819207",
  "sd_model_checkpoint": "sd_xl_base_1.0 [31e35c80fc]",
  "sd_checkpoint_hash": "31e35c80fc4829d14f90153f4c74cd59c90b779f6afe05a74cd6120b893f7e5b",
  "diffusers_to_gpu": true,
  "diffusers_offload_mode": "none",
  "civitai_token": "f1099bd3751c5985c20e4b25b79cba65"
}
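
For context, a sketch of what the placement options above roughly correspond to in plain diffusers (an interpretation of the settings, not SD.Next's actual code path):

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
)

# "diffusers_to_gpu": true with "diffusers_offload_mode": "none" keeps the whole
# pipeline resident on the device, consistent with the ~9 GB VRAM figures above.
pipe.to("xpu")

# A non-"none" offload mode would instead keep idle submodels in system RAM:
# pipe.enable_model_cpu_offload(device="xpu")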


Model Info

Model: sd_xl_base_1.0
Type: sdxl
Class: StableDiffusionXLPipeline
Size: 6 938 078 334 bytes
Modified: 2025-07-15 13:47:33


Module | Class | Device | DType | Params | Modules
vae | AutoencoderKL | xpu:0 | torch.bfloat16 | 83653863 | 243
text_encoder | CLIPTextModel | xpu:0 | torch.bfloat16 | 123060480 | 152
text_encoder_2 | CLIPTextModelWithProjection | xpu:0 | torch.bfloat16 | 694659840 | 393
tokenizer | CLIPTokenizer | None | None | 0 | 0
tokenizer_2 | CLIPTokenizer | None | None | 0 | 0
unet | UNet2DConditionModel | xpu:0 | torch.bfloat16 | 2567463684 | 1930
scheduler | EulerDiscreteScheduler | None | None | 0 | 0
image_encoder | NoneType | None | None | 0 | 0
feature_extractor | NoneType | None | None | 0 | 0
force_zeros_for_empty_prompt | bool | None | None | 0 | 0
_class_name | str | None | None | 0 | 0
_diffusers_version | str | None | None | 0 | 0

Module configs (None for the remaining entries):

vae: FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 4, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.13025, 'shift_factor': None, 'latents_mean': None, 'latents_std': None, 'force_upcast': False, 'use_quant_conv': True, 'use_post_quant_conv': True, 'mid_block_add_attention': True, '_use_default_values': ['mid_block_add_attention', 'latents_std', 'use_quant_conv', 'use_post_quant_conv', 'latents_mean', 'shift_factor'], '_class_name': 'AutoencoderKL', '_diffusers_version': '0.20.0.dev0', '_name_or_path': '../sdxl-vae/'})

text_encoder: CLIPTextConfig { "architectures": [ "CLIPTextModel" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "quick_gelu", "hidden_size": 768, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 77, "model_type": "clip_text_model", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "projection_dim": 768, "torch_dtype": "float16", "transformers_version": "4.53.2", "vocab_size": 49408 }

text_encoder_2: CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "gelu", "hidden_size": 1280, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 5120, "layer_norm_eps": 1e-05, "max_position_embeddings": 77, "model_type": "clip_text_model", "num_attention_heads": 20, "num_hidden_layers": 32, "pad_token_id": 1, "projection_dim": 1280, "torch_dtype": "float16", "transformers_version": "4.53.2", "vocab_size": 49408 }

unet: FrozenDict({'sample_size': 128, 'in_channels': 4, 'out_channels': 4, 'center_input_sample': False,

scheduler: FrozenDict({'num_train_timesteps': 1000, 'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'trained_betas': None, 'prediction_type': 'epsilon', 'interpolation_type': 'linear', 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'sigma_min': None, 'sigma_max': None, 'timestep_spacing': 'leading', 'timestep_type': 'discrete', 'steps_offset': 1, 'rescale_betas_zero_snr': False, 'final_sigmas_type': 'zero', '_use_default_values': ['use_exponential_sigmas', 'timestep_type', 'sigma_min', 'final_sigmas_type', 'use_beta_sigmas', 'sigma_max', 'rescale_betas_zero_snr'], '_class_name': 'EulerDiscreteScheduler', '_diffusers_version': '0.19.0.dev0', 'clip_sample': False, 'sample_max_value': 1.0, 'set_alpha_to_one': False, 'skip_prk_steps': True})
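
A table like the one above can be regenerated from a loaded pipeline by iterating its components. A minimal sketch (parameter and module counts are computed; device and dtype depend on where the pipeline was moved):

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
)

for name, module in pipe.components.items():
    if isinstance(module, torch.nn.Module):
        params = sum(p.numel() for p in module.parameters())
        submodules = sum(1 for _ in module.modules())
        device = next(module.parameters()).device
        dtype = next(module.parameters()).dtype
        print(f"{name} | {type(module).__name__} | {device} | {dtype} | {params} | {submodules}")
    else:
        # tokenizers, schedulers and empty slots carry no parameters
        print(f"{name} | {type(module).__name__} | None | None | 0 | 0")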


{
  "modelspec.sai_model_spec": "1.0.0",
  "modelspec.architecture": "stable-diffusion-xl-v1-base",
  "modelspec.implementation": "https://github.com/Stability-AI/generative-models",
  "modelspec.title": "Stable Diffusion XL 1.0 Base",
  "modelspec.author": "StabilityAI",
  "modelspec.description": "SDXL 1.0 Base Model, compositional expert. SDXL, the most advanced development in the Stable Diffusion text-to-image suite of models. SDXL produces massively improved image and composition detail over its predecessors. The ability to generate hyper-realistic creations for films, television, music, and instructional videos, as well as offering advancements for design and industrial use, places SDXL at the forefront of real world applications for AI imagery.",
  "modelspec.date": "2023-07-26",
  "modelspec.resolution": "1024x1024",
  "modelspec.prediction_type": "epsilon",
  "modelspec.license": "CreativeML Open RAIL++-M License",
  "modelspec.thumbnail": "data",
  "modelspec.hash_sha256": "0xd7a9105a900fd52748f20725fe52fe52b507fd36bee4fc107b1550a26e6ee1d7"
}
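
The modelspec block above is stored in the safetensors header and can be read without loading the weights. A sketch, assuming a local path to the checkpoint file:

from safetensors import safe_open

# Path is an assumption; point it at the local sd_xl_base_1.0.safetensors checkpoint.
with safe_open("sd_xl_base_1.0.safetensors", framework="pt") as f:
    metadata = f.metadata() or {}
    for key, value in sorted(metadata.items()):
        if key.startswith("modelspec."):
            print(f"{key}: {value[:80]}")  # the thumbnail value is a long data URI, so truncate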

