Intro

App: https://github.com/vladmandic/sdnext/tree/dev Version 2025-07-04 (ipex)

Model: https://huggingface.co/Kwai-Kolors/Kolors

HW: Intel Core i7-1355U, Intel Iris Xe Graphics iGPU, 96 GB DDR5-5600 CL46 RAM

Part 1 - Bookshop

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

Execution: Time: 15m 5.29s | total 906.63 pipeline 869.25 decode 36.00 preview 1.34 | RAM 44.6 GB 47%

	STEPS: 4	STEPS: 8	STEPS: 16	STEPS: 20	STEPS: 32
CFG0
CFG1
CFG2
CFG3
CFG4
CFG5
CFG6
CFG7
CFG8
CFG9

Part 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Execution: Time: 18m 39.49s | total 1120.96 pipeline 1083.47 decode 35.96 preview 1.47 | RAM 61.37 GB 65%

	STEPS: 12	STEPS: 16	STEPS: 20	STEPS: 24
CFG=1
CFG=2
CFG=3
CFG=5
CFG=8

Part 3 - Legs and ribbon

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

	STEPS: 12	STEPS: 16	STEPS: 20	STEPS: 24
CFG=1
CFG=2
CFG=3
CFG=5
CFG=8

Is there a way to draw the legs correctly?

Try several random seeds; one of them may give a better result
Try different resolutions
Try different samplers
Try negative prompts

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Trying different resolutions and samplers

Seed 101684518 was found to give a better result than seed 3033194654, simply by running several generations with random seeds.
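The seed and resolution sweep can also be scripted instead of clicked through in the UI. The following is only a sketch, assuming the model is driven directly through the diffusers `KolorsPipeline` rather than through SD.Next. The helper names (`build_runs`, `random_seeds`, `render`, `main`) are made up for illustration, and `variant="fp16"` follows the diffusers documentation example rather than anything stated on this page.

```python
import itertools
import random


def build_runs(seeds, sizes, steps=12, cfg=3.0):
    """Enumerate one settings dict per (seed, size) combination."""
    return [
        {"seed": seed, "width": size, "height": size, "steps": steps, "cfg": cfg}
        for seed, size in itertools.product(seeds, sizes)
    ]


def random_seeds(count, rng=None):
    """Draw `count` random 32-bit seeds, as a UI would when the seed is left random."""
    rng = rng or random.Random()
    return [rng.randrange(2**32) for _ in range(count)]


def render(pipe, prompt, run, negative_prompt=None):
    """Render one image for one settings dict with a diffusers pipeline."""
    import torch  # imported lazily so the helpers above work without torch

    generator = torch.Generator("cpu").manual_seed(run["seed"])
    result = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=run["width"],
        height=run["height"],
        num_inference_steps=run["steps"],
        guidance_scale=run["cfg"],
        generator=generator,
    )
    return result.images[0]


def main(prompt):
    """Hypothetical driver: loads the model (slow, large download) and saves images."""
    import torch
    from diffusers import KolorsPipeline

    # Assumption: load the fp16 weight files and cast to float32 for CPU use.
    pipe = KolorsPipeline.from_pretrained(
        "Kwai-Kolors/Kolors-diffusers", torch_dtype=torch.float32, variant="fp16"
    )
    for run in build_runs([101684518, 3033194654], [1024, 768, 512]):
        image = render(pipe, prompt, run)
        image.save(f"legs_seed{run['seed']}_{run['width']}.png")
```

Nothing is rendered until `main(prompt)` is called. Keeping the seed fixed while varying one other setting, as in the table below, makes the images directly comparable.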

	Steps: 12 | Size: 1024x1024 | Seed: 101684518 | CFG scale: 3	Steps: 12 | Size: 768x768 | Seed: 101684518 | CFG scale: 3	Steps: 12 | Size: 512x512 | Seed: 101684518 | CFG scale: 3
Default (Euler)	Time: 11m 22.99s total 683.75 pipeline 645.62 decode 37.33 preview 0.76 RAM 41.54 GB 44%	Time: 6m 38.81s total 399.49 pipeline 378.25 decode 20.54 preview 0.68 RAM 41.79 GB 44%
Heun			Time: 5m 53.04s total 353.95 pipeline 343.03 decode 9.98 preview 0.91 RAM 41.76 GB 44%
LCM			Time: 3m 17.69s total 198.13 pipeline 187.76 decode 9.89 preview 0.44 RAM 41.76 GB 44%
DDIM			Time: 3m 17.08s total 197.57 pipeline 187.34 decode 9.71 preview 0.49 RAM 41.76 GB 44%
DPM++ 2M			Time: 3m 13.02s total 193.43 pipeline 183.74 decode 9.26 preview 0.41 RAM 41.75 GB 44%
DPM++ 1S			Time: 3m 15.26s total 195.58 pipeline 185.46 decode 9.76 preview 0.32 RAM 41.78 GB 44%
DPM++ 2M SDE			Time: 3m 18.49s total 198.93 pipeline 188.45 decode 10.00 preview 0.44 RAM 41.79 GB 44%
KDPM2			Time: 5m 55.50s total 356.34 pipeline 345.08 decode 9.76 preview 0.85 prepare 0.63 RAM 41.76 GB 44%
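To compare the samplers more easily, the timing strings from the 512x512 column can be parsed and ranked. This is a small sketch; the function names are made up for illustration, and dividing the pipeline time by the step count gives only a rough per-step figure, since the pipeline time may include work other than the sampling steps.

```python
import re

# Rows copied from the 512x512 column of the table above (12 steps, CFG scale 3).
ROWS = {
    "Heun": "Time: 5m 53.04s total 353.95 pipeline 343.03 decode 9.98 preview 0.91 RAM 41.76 GB 44%",
    "LCM": "Time: 3m 17.69s total 198.13 pipeline 187.76 decode 9.89 preview 0.44 RAM 41.76 GB 44%",
    "DDIM": "Time: 3m 17.08s total 197.57 pipeline 187.34 decode 9.71 preview 0.49 RAM 41.76 GB 44%",
    "DPM++ 2M": "Time: 3m 13.02s total 193.43 pipeline 183.74 decode 9.26 preview 0.41 RAM 41.75 GB 44%",
    "DPM++ 1S": "Time: 3m 15.26s total 195.58 pipeline 185.46 decode 9.76 preview 0.32 RAM 41.78 GB 44%",
    "DPM++ 2M SDE": "Time: 3m 18.49s total 198.93 pipeline 188.45 decode 10.00 preview 0.44 RAM 41.79 GB 44%",
    "KDPM2": "Time: 5m 55.50s total 356.34 pipeline 345.08 decode 9.76 preview 0.85 prepare 0.63 RAM 41.76 GB 44%",
}

# Matches the named stages and their duration in seconds.
FIELD = re.compile(r"\b(total|pipeline|decode|preview|prepare)\s+([0-9.]+)")


def parse_timing(line):
    """Return the stage timings of one row, e.g. {'total': 353.95, 'pipeline': 343.03, ...}."""
    return {name: float(value) for name, value in FIELD.findall(line)}


def rank_by_pipeline(rows, steps):
    """Sort samplers by pipeline time; return (name, seconds per step, ratio to fastest)."""
    parsed = {name: parse_timing(line) for name, line in rows.items()}
    fastest = min(t["pipeline"] for t in parsed.values())
    ordered = sorted(parsed.items(), key=lambda item: item[1]["pipeline"])
    return [(name, t["pipeline"] / steps, t["pipeline"] / fastest) for name, t in ordered]


if __name__ == "__main__":
    for name, per_step, ratio in rank_by_pipeline(ROWS, steps=12):
        print(f"{name:14s} {per_step:6.2f} s/step  x{ratio:.2f}")
```

Heun and KDPM2 come out a little under twice as slow as the others. That is consistent with both being second-order samplers that evaluate the model roughly twice per step.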

System Info

app: sdnext updated: 2025-07-04 hash: 1a3b6e3b url: https://github.com/vladmandic/sdnext/tree/dev
arch: x86_64 cpu: x86_64 system: Linux release: 6.11.0-28-generic
python: 3.12.3 Torch 2.7.1+xpu
device: Intel(R) Iris(R) Xe Graphics (iGPU) openvino: 2025.2.0
ram: free:31.06 used:62.91 total:93.97 gpu: total:93.97
xformers: diffusers: 0.35.0.dev0 transformers: 4.53.0
active: cpu dtype: torch.float32 vae: torch.float32 unet: torch.float32
base: Diffusers/Kwai-Kolors/Kolors-diffusers [7e091c7519] refiner: none vae: none te: none unet: none
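Because everything runs on the CPU in float32, the RAM needed just to hold the weights can be estimated from the parameter counts in the Model Data table below. The snippet is a back-of-the-envelope sketch: the function names are made up, sizes are in decimal gigabytes, and activations and other overhead are not counted.

```python
# Parameter counts copied from the "Params" column of the Model Data table.
PARAMS = {
    "vae": 83_653_863,
    "text_encoder": 6_243_584_000,
    "unet": 2_579_458_820,
}

# Bytes used by one parameter for a few common dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weights_gb(param_count, dtype="float32"):
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


def footprint(params, dtype="float32"):
    """Per-module weight size plus a 'total' entry, all in GB."""
    sizes = {name: weights_gb(count, dtype) for name, count in params.items()}
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    for name, gb in footprint(PARAMS).items():
        print(f"{name:13s} {gb:6.2f} GB")
```

In float32 the weights alone come to about 35.6 GB, of which roughly 25 GB is the ChatGLM text encoder. This is in the same range as the 41 to 45 GB of RAM reported for most runs, with the remainder presumably going to activations and the application itself.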

Model Data

Model: Diffusers/Kwai-Kolors/Kolors-diffusers
Type: KolorsPipeline
Class: KolorsPipeline
Size: 0 bytes
Modified: 2025-06-26 22:08:35

SD.Next dev 2025-06-29

Module	Class	Device	DType	Params	Modules	Config
vae	AutoencoderKL	cpu	torch.float32	83653863	243	FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 4, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.13025, 'shift_factor': None, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': True, 'use_post_quant_conv': True, 'mid_block_add_attention': True, '_use_default_values': ['mid_block_add_attention', 'latents_mean', 'use_quant_conv', 'use_post_quant_conv', 'latents_std', 'shift_factor', 'force_upcast'], '_class_name': 'AutoencoderKL', '_diffusers_version': '0.18.0.dev0', '_name_or_path': 'models/Diffusers/models--Kwai-Kolors--Kolors-diffusers/snapshots/7e091c75199e910a26cd1b51ed52c28de5db3711/vae'})
text_encoder	ChatGLMModel	cpu	torch.float32	6243584000	316	ChatGLMConfig { "add_bias_linear": false, "add_qkv_bias": true, "apply_query_key_layer_scaling": true, "apply_residual_connection_post_layernorm": false, "architectures": [ "ChatGLMModel" ], "attention_dropout": 0.0, "attention_softmax_in_fp32": true, "auto_map": { "AutoConfig": "configuration_chatglm.ChatGLMConfig", "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification" }, "bias_dropout_fusion": true, "classifier_dropout": null, "eos_token_id": 2, "ffn_hidden_size": 13696, "fp32_residual_connection": false, "hidden_dropout": 0.0, "hidden_size": 4096, "kv_channels": 128, "layernorm_epsilon": 1e-05, "model_type": "chatglm", "multi_query_attention": true, "multi_query_group_num": 2, "num_attention_heads": 32, "num_layers": 28, "original_rope": true, "pad_token_id": 0, "padded_vocab_size": 65024, "post_layer_norm": true, "pre_seq_len": null, "prefix_projection": false, "quantization_bit": 0, "rmsnorm": true, "seq_length": 32768, "tie_word_embeddings": false, "torch_dtype": "float32", "transformers_version": "4.53.0", "use_cache": true, "vocab_size": 65024 }
tokenizer	ChatGLMTokenizer	None	None	0	0	None
unet	UNet2DConditionModel	cpu	torch.float32	2579458820	1931	FrozenDict({'sample_size': 128, 'in_channels': 4, 'out_channels': 4, 'center_input_sample': False, 'flip_sin_to_cos': True, 'freq_shift': 0, 'down_block_types': ['DownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D'], 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'up_block_types': ['CrossAttnUpBlock2D', 'CrossAttnUpBlock2D', 'UpBlock2D'], 'only_cross_attention': False, 'block_out_channels': [320, 640, 1280], 'layers_per_block': 2, 'downsample_padding': 1, 'mid_block_scale_factor': 1, 'dropout': 0.0, 'act_fn': 'silu', 'norm_num_groups': 32, 'norm_eps': 1e-05, 'cross_attention_dim': 2048, 'transformer_layers_per_block': [1, 2, 10], 'reverse_transformer_layers_per_block': None, 'encoder_hid_dim': 4096, 'encoder_hid_dim_type': 'text_proj', 'attention_head_dim': [5, 10, 20], 'num_attention_heads': None, 'dual_cross_attention': False, 'use_linear_projection': True, 'class_embed_type': None, 'addition_embed_type': 'text_time', 'addition_time_embed_dim': 256, 'num_class_embeds': None, 'upcast_attention': False, 'resnet_time_scale_shift': 'default', 'resnet_skip_time_act': False, 'resnet_out_scale_factor': 1.0, 'time_embedding_type': 'positional', 'time_embedding_dim': None, 'time_embedding_act_fn': None, 'timestep_post_act': None, 'time_cond_proj_dim': None, 'conv_in_kernel': 3, 'conv_out_kernel': 3, 'projection_class_embeddings_input_dim': 5632, 'attention_type': 'default', 'class_embeddings_concat': False, 'mid_block_only_cross_attention': None, 'cross_attention_norm': None, 'addition_embed_type_num_heads': 64, '_class_name': 'UNet2DConditionModel', '_diffusers_version': '0.27.0.dev0', '_name_or_path': 'models/Diffusers/models--Kwai-Kolors--Kolors-diffusers/snapshots/7e091c75199e910a26cd1b51ed52c28de5db3711/unet'})
scheduler	EulerDiscreteScheduler	None	None	0	0	FrozenDict({'num_train_timesteps': 1100, 'beta_start': 0.00085, 'beta_end': 0.014, 'beta_schedule': 'scaled_linear', 'trained_betas': None, 'prediction_type': 'epsilon', 'interpolation_type': 'linear', 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'sigma_min': None, 'sigma_max': None, 'timestep_spacing': 'leading', 'timestep_type': 'discrete', 'steps_offset': 1, 'rescale_betas_zero_snr': False, 'final_sigmas_type': 'zero', '_use_default_values': ['use_exponential_sigmas', 'final_sigmas_type', 'timestep_type', 'sigma_min', 'sigma_max', 'use_beta_sigmas'], '_class_name': 'EulerDiscreteScheduler', '_diffusers_version': '0.18.0.dev0', 'clip_sample': False, 'clip_sample_range': 1.0, 'dynamic_thresholding_ratio': 0.995, 'sample_max_value': 1.0, 'set_alpha_to_one': False, 'skip_prk_steps': True, 'thresholding': False})
image_encoder	NoneType	None	None	0	0	None
feature_extractor	NoneType	None	None	0	0	None
force_zeros_for_empty_prompt	bool	None	None	0	0	None
_name_or_path	str	None	None	0	0	None
_class_name	str	None	None	0	0	None
_diffusers_version	str	None	None	0	0	None
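The scheduler row above can be read more concretely by computing what its config implies. The sketch below uses the usual formulas for a `scaled_linear` beta schedule and `leading` timestep spacing with the values from the config (`beta_start`, `beta_end`, `num_train_timesteps`, `steps_offset`). It is an illustration of those formulas under that assumption, not a copy of the diffusers source, and the function names are made up.

```python
import math


def scaled_linear_sigmas(beta_start=0.00085, beta_end=0.014, num_train_timesteps=1100):
    """Noise level sigma_t for every training timestep of a 'scaled_linear' schedule."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    sigmas = []
    alpha_cumprod = 1.0
    for i in range(num_train_timesteps):
        # Betas are linear in sqrt-space, then squared.
        beta = (lo + (hi - lo) * i / (num_train_timesteps - 1)) ** 2
        alpha_cumprod *= 1.0 - beta
        sigmas.append(math.sqrt((1.0 - alpha_cumprod) / alpha_cumprod))
    return sigmas


def leading_timesteps(num_inference_steps, num_train_timesteps=1100, steps_offset=1):
    """Timesteps visited with 'leading' spacing, from noisiest to cleanest."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio + steps_offset for i in reversed(range(num_inference_steps))]


if __name__ == "__main__":
    sigmas = scaled_linear_sigmas()
    for t in leading_timesteps(12):
        print(f"t={t:4d}  sigma={sigmas[t]:.4f}")
```

With 12 steps the visited timesteps run from 1002 down to 1, so raising the step count mainly makes the spacing between noise levels finer rather than changing the start or end of the schedule by much.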
