Intro

App: https://github.com/vladmandic/sdnext/tree/dev Version 2025-07-04 (ipex)

Model: https://huggingface.co/Kwai-Kolors/Kolors

HW: Intel Core i7-1355U, Intel Xe Graphics iGPU, 96 GB DDR5-5600 CL46 RAM

Part 1 - Bookshop

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

Parameters: Steps: 16| Size: 1024x1024| Seed: 3033194654| CFG scale: 6| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline

Execution: Time: 15m 5.29s | total 906.63 pipeline 869.25 decode 36.00 preview 1.34 | RAM 44.6 GB 47%
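The Execution line can be split into its parts to see where the time goes. This is a minimal sketch, assuming the line always uses the "name value" layout shown above; `parse_execution` is a hypothetical helper and not part of SD.Next.

```python
import re

def parse_execution(line):
    """Return the timings (seconds) and RAM use (GB) found in an Execution line."""
    stats = {}
    # Timings are written as "name value" pairs, e.g. "pipeline 869.25".
    pattern = r"\b(total|pipeline|decode|preview|prepare) (\d+(?:\.\d+)?)"
    for name, value in re.findall(pattern, line):
        stats[name] = float(value)
    # RAM use is written as "RAM <number> GB".
    ram = re.search(r"RAM (\d+(?:\.\d+)?) GB", line)
    if ram:
        stats["ram_gb"] = float(ram.group(1))
    return stats

line = ("Execution: Time: 15m 5.29s | total 906.63 pipeline 869.25 "
        "decode 36.00 preview 1.34 | RAM 44.6 GB 47%")
stats = parse_execution(line)

# Share of the total run time spent in the pipeline stage.
pipeline_share = stats["pipeline"] / stats["total"]
print(stats)
print(f"pipeline share: {pipeline_share:.1%}")
```

For this run nearly all of the time is in the pipeline stage, with the decode taking most of the rest.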



[Image grid: columns are STEPS: 4, 8, 16, 20, 32; rows are CFG 0 to CFG 9]

Part 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Parameters: Steps: 20| Size: 1024x1024| Seed: 3317287141| CFG scale: 6| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline

Execution: Time: 18m 39.49s | total 1120.96 pipeline 1083.47 decode 35.96 preview 1.47 | RAM 61.37 GB 65%



[Image grid: columns are Steps 12, 16, 20, 24; rows are CFG=1, 2, 3, 5, 8]

Part 3 - Legs and ribbon

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Parameters: Steps: 20| Size: 1024x1024| Seed: 3033194654| CFG scale: 8| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline
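To compare runs it helps to turn a Parameters line into key/value pairs. This is a rough sketch, assuming fields are separated by `|` and each field is `name: value`, as in the lines in this post; `parse_parameters` is a hypothetical helper and not an SD.Next function.

```python
def parse_parameters(line):
    """Split a 'Parameters: A: 1| B: 2' line into a dict of strings."""
    # Drop the leading "Parameters:" label when it is present.
    body = line.split(":", 1)[1] if line.startswith("Parameters:") else line
    params = {}
    for field in body.split("|"):
        key, _, value = field.partition(":")
        if value:
            params[key.strip()] = value.strip()
    return params

line = ("Parameters: Steps: 20| Size: 1024x1024| Seed: 3033194654| CFG scale: 8| "
        "Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| "
        "Operations: txt2img| Pipeline: KolorsPipeline")
params = parse_parameters(line)

# Values stay as strings; convert the ones that are needed as numbers.
steps = int(params["Steps"])
width, height = (int(v) for v in params["Size"].split("x"))
print(steps, width, height, params["Seed"], params["CFG scale"])
```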



[Image grid: columns are Steps 12, 16, 20, 24; rows are CFG=1, 2, 3, 5, 8]

Is there a way to draw the legs correctly?

  1. trying several random seeds, which may give a better result
  2. trying different resolutions
  3. trying different samplers
  4. trying negative prompts
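The options above can be combined into a list of trial settings before any image is generated. This is a sketch only: `build_trials` and the dictionary keys are made up for illustration, and each trial would still have to be entered in SD.Next (or passed to whatever renders the image) by hand.

```python
import itertools
import random

def build_trials(n_seeds, sizes, samplers, rng):
    """Return one settings dict per (seed, size, sampler) combination."""
    # Seeds are drawn from the 32-bit range, like the seeds shown in this post.
    seeds = [rng.randrange(2**32) for _ in range(n_seeds)]
    return [
        {"seed": seed, "size": size, "sampler": sampler}
        for seed, size, sampler in itertools.product(seeds, sizes, samplers)
    ]

rng = random.Random(0)  # fixed here only so the trial list is repeatable
trials = build_trials(
    n_seeds=3,
    sizes=[(1024, 1024), (768, 768), (512, 512)],
    samplers=["Euler", "DPM++ 2M"],
    rng=rng,
)
print(len(trials))  # 3 seeds x 3 sizes x 2 samplers
for trial in trials[:3]:
    print(trial)
```

On this hardware every trial costs minutes, so it is probably cheaper to sweep seeds at a small size and few steps first and only then rerun the promising ones at 1024x1024.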

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Trying different resolutions and samplers

Seed 101684518 was found to give a better result than 3033194654, simply by running several generations with random seeds.

Steps: 12| Size: 1024x1024| Seed: 101684518| CFG scale: 3
Steps: 12| Size: 768x768| Seed: 101684518| CFG scale: 3
Steps: 12| Size: 512x512| Seed: 101684518| CFG scale: 3

Default (Euler)
Execution: Time: 11m 22.99s | total 683.75 pipeline 645.62 decode 37.33 preview 0.76 | RAM 41.54 GB 44%
Execution: Time: 6m 38.81s | total 399.49 pipeline 378.25 decode 20.54 preview 0.68 | RAM 41.79 GB 44%

Heun
Execution: Time: 5m 53.04s | total 353.95 pipeline 343.03 decode 9.98 preview 0.91 | RAM 41.76 GB 44%

LCM
Execution: Time: 3m 17.69s | total 198.13 pipeline 187.76 decode 9.89 preview 0.44 | RAM 41.76 GB 44%

DDIM
Execution: Time: 3m 17.08s | total 197.57 pipeline 187.34 decode 9.71 preview 0.49 | RAM 41.76 GB 44%

DPM++ 2M
Execution: Time: 3m 13.02s | total 193.43 pipeline 183.74 decode 9.26 preview 0.41 | RAM 41.75 GB 44%

DPM++ 1S
Execution: Time: 3m 15.26s | total 195.58 pipeline 185.46 decode 9.76 preview 0.32 | RAM 41.78 GB 44%

DPM++ 2M SDE
Execution: Time: 3m 18.49s | total 198.93 pipeline 188.45 decode 10.00 preview 0.44 | RAM 41.79 GB 44%

KDPM2
Execution: Time: 5m 55.50s | total 356.34 pipeline 345.08 decode 9.76 preview 0.85 prepare 0.63 | RAM 41.76 GB 44%
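A quick way to compare the samplers is to divide each pipeline time by the fastest one. The sketch below uses the pipeline values reported for Heun through KDPM2, and assumes those seven runs were made at the same resolution (their decode times of about 10 s are consistent with that, but the post does not state it).

```python
# Pipeline time in seconds per sampler, copied from the timings above.
pipeline_seconds = {
    "Heun": 343.03,
    "LCM": 187.76,
    "DDIM": 187.34,
    "DPM++ 2M": 183.74,
    "DPM++ 1S": 185.46,
    "DPM++ 2M SDE": 188.45,
    "KDPM2": 345.08,
}

fastest = min(pipeline_seconds.values())

# Print samplers from fastest to slowest with their cost relative to the fastest.
for name, seconds in sorted(pipeline_seconds.items(), key=lambda item: item[1]):
    print(f"{name:<13} {seconds:7.2f} s  x{seconds / fastest:.2f}")
```

Heun and KDPM2 come out at close to 1.9 times the others, which fits them being second-order samplers that call the model about twice per step. The remaining samplers are within a few percent of each other, so the choice between them can be made on image quality alone.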

Negative prompt

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Negative: ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, ugly, blurry, bad anatomy, bad proportions, extra limbs, cloned face, out of frame, ugly, extra limbs, bad anatomy, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, mutated hands, fused fingers, too many fingers, long neck, extra head, cloned head, extra body, cloned body, watermark. extra hands, clone hands, weird hand, weird finger, weird arm, (mutation:1.3), (deformed:1.3), (blurry), (bad anatomy:1.1), (bad proportions:1.2), out of frame, ugly, (long neck:1.2), (worst quality:1.4), (low quality:1.4), (monochrome:1.1), text, signature, watermark, bad anatomy, disfigured, jpeg artifacts, 3d max, grotesque, desaturated, blur, haze, polysyndactyly

Parameters: Steps: 12| Size: 1024x1024| Seed: 101684518| CFG scale: 3| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline

[Image comparison: Steps: 12| Size: 1024x1024| Seed: 101684518| CFG scale: 3, without and with the negative prompt]

Trying more steps

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Parameters: Steps: 100| Size: 1024x1024| Sampler: DPM++ 2M SDE| Seed: 101684518| CFG scale: 3| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline

Execution: Time: 94m 49.55s | total 5692.86 pipeline 5648.46 decode 41.05 preview 3.31 | RAM 42.46 GB 45%
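Dividing the pipeline time by the step count shows how the cost scales with steps. The sketch below uses the three 1024x1024 runs in this post that report both values; note that they differ in prompt, and the 100-step run uses DPM++ 2M SDE instead of the default sampler, so this is only a rough comparison. `estimate_minutes` is a hypothetical helper.

```python
# (steps, pipeline seconds) for the 1024x1024 runs from Part 1, Part 2 and this section.
runs = [(16, 869.25), (20, 1083.47), (100, 5648.46)]

seconds_per_step = {steps: pipeline / steps for steps, pipeline in runs}
for steps, rate in seconds_per_step.items():
    print(f"{steps:>3} steps: {rate:.2f} s/step")

def estimate_minutes(steps, rate):
    """Estimate pipeline time in minutes, assuming the cost grows linearly with steps."""
    return steps * rate / 60

# Rough estimate for a 48-step run, using the rate measured at 100 steps.
print(f"48 steps: about {estimate_minutes(48, seconds_per_step[100]):.0f} min")
```

The rate stays in the range of roughly 54 to 56 s/step across these runs, so on this machine the run time at 1024x1024 can be estimated in advance as steps times about one minute.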

[Image grid: Size: 1024x1024| Seed: 101684518| Sampler: DPM++ 2M SDE; columns are Steps: 12, 24, 48, 100]

System Info

app: sdnext updated: 2025-07-04 hash: 1a3b6e3b url: https://github.com/vladmandic/sdnext/tree/dev
arch: x86_64 cpu: x86_64 system: Linux release: 6.11.0-28-generic
python: 3.12.3 Torch 2.7.1+xpu
device: Intel(R) Iris(R) Xe Graphics (iGPU) openvino: 2025.2.0
ram: free:31.06 used:62.91 total:93.97 gpu: total:93.97
xformers: diffusers: 0.35.0.dev0 transformers: 4.53.0
active: cpu dtype: torch.float32 vae: torch.float32 unet: torch.float32
base: Diffusers/Kwai-Kolors/Kolors-diffusers [7e091c7519] refiner: none vae: none te: none unet: none

Model Data

Model: Diffusers/Kwai-Kolors/Kolors-diffusers
Type: KolorsPipeline
Class: KolorsPipeline
Size: 0 bytes
Modified: 2025-06-26 22:08:35

SD.Next dev 2025-06-29

Columns: Module| Class| Device| DType| Params| Modules| Config

Module: vae| Class: AutoencoderKL| Device: cpu| DType: torch.float32| Params: 83653863| Modules: 243| Config:
FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 4, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.13025, 'shift_factor': None, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': True, 'use_post_quant_conv': True, 'mid_block_add_attention': True, '_use_default_values': ['mid_block_add_attention', 'latents_mean', 'use_quant_conv', 'use_post_quant_conv', 'latents_std', 'shift_factor', 'force_upcast'], '_class_name': 'AutoencoderKL', '_diffusers_version': '0.18.0.dev0', '_name_or_path': 'models/Diffusers/models--Kwai-Kolors--Kolors-diffusers/snapshots/7e091c75199e910a26cd1b51ed52c28de5db3711/vae'})

Module: text_encoder| Class: ChatGLMModel| Device: cpu| DType: torch.float32| Params: 6243584000| Modules: 316| Config:
ChatGLMConfig { "add_bias_linear": false, "add_qkv_bias": true, "apply_query_key_layer_scaling": true, "apply_residual_connection_post_layernorm": false, "architectures": [ "ChatGLMModel" ], "attention_dropout": 0.0, "attention_softmax_in_fp32": true, "auto_map": { "AutoConfig": "configuration_chatglm.ChatGLMConfig", "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification" }, "bias_dropout_fusion": true, "classifier_dropout": null, "eos_token_id": 2, "ffn_hidden_size": 13696, "fp32_residual_connection": false, "hidden_dropout": 0.0, "hidden_size": 4096, "kv_channels": 128, "layernorm_epsilon": 1e-05, "model_type": "chatglm", "multi_query_attention": true, "multi_query_group_num": 2, "num_attention_heads": 32, "num_layers": 28, "original_rope": true, "pad_token_id": 0, "padded_vocab_size": 65024, "post_layer_norm": true, "pre_seq_len": null, "prefix_projection": false, "quantization_bit": 0, "rmsnorm": true, "seq_length": 32768, "tie_word_embeddings": false, "torch_dtype": "float32", "transformers_version": "4.53.0", "use_cache": true, "vocab_size": 65024 }

Module: tokenizer| Class: ChatGLMTokenizer| Device: None| DType: None| Params: 0| Modules: 0| Config: None

Module: unet| Class: UNet2DConditionModel| Device: cpu| DType: torch.float32| Params: 2579458820| Modules: 1931| Config:
FrozenDict({'sample_size': 128, 'in_channels': 4, 'out_channels': 4, 'center_input_sample': False, 'flip_sin_to_cos': True, 'freq_shift': 0, 'down_block_types': ['DownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D'], 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'up_block_types': ['CrossAttnUpBlock2D', 'CrossAttnUpBlock2D', 'UpBlock2D'], 'only_cross_attention': False, 'block_out_channels': [320, 640, 1280], 'layers_per_block': 2, 'downsample_padding': 1, 'mid_block_scale_factor': 1, 'dropout': 0.0, 'act_fn': 'silu', 'norm_num_groups': 32, 'norm_eps': 1e-05, 'cross_attention_dim': 2048, 'transformer_layers_per_block': [1, 2, 10], 'reverse_transformer_layers_per_block': None, 'encoder_hid_dim': 4096, 'encoder_hid_dim_type': 'text_proj', 'attention_head_dim': [5, 10, 20], 'num_attention_heads': None, 'dual_cross_attention': False, 'use_linear_projection': True, 'class_embed_type': None, 'addition_embed_type': 'text_time', 'addition_time_embed_dim': 256, 'num_class_embeds': None, 'upcast_attention': False, 'resnet_time_scale_shift': 'default', 'resnet_skip_time_act': False, 'resnet_out_scale_factor': 1.0, 'time_embedding_type': 'positional', 'time_embedding_dim': None, 'time_embedding_act_fn': None, 'timestep_post_act': None, 'time_cond_proj_dim': None, 'conv_in_kernel': 3, 'conv_out_kernel': 3, 'projection_class_embeddings_input_dim': 5632, 'attention_type': 'default', 'class_embeddings_concat': False, 'mid_block_only_cross_attention': None, 'cross_attention_norm': None, 'addition_embed_type_num_heads': 64, '_class_name': 'UNet2DConditionModel', '_diffusers_version': '0.27.0.dev0', '_name_or_path': 'models/Diffusers/models--Kwai-Kolors--Kolors-diffusers/snapshots/7e091c75199e910a26cd1b51ed52c28de5db3711/unet'})

Module: scheduler| Class: EulerDiscreteScheduler| Device: None| DType: None| Params: 0| Modules: 0| Config:
FrozenDict({'num_train_timesteps': 1100, 'beta_start': 0.00085, 'beta_end': 0.014, 'beta_schedule': 'scaled_linear', 'trained_betas': None, 'prediction_type': 'epsilon', 'interpolation_type': 'linear', 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'sigma_min': None, 'sigma_max': None, 'timestep_spacing': 'leading', 'timestep_type': 'discrete', 'steps_offset': 1, 'rescale_betas_zero_snr': False, 'final_sigmas_type': 'zero', '_use_default_values': ['use_exponential_sigmas', 'final_sigmas_type', 'timestep_type', 'sigma_min', 'sigma_max', 'use_beta_sigmas'], '_class_name': 'EulerDiscreteScheduler', '_diffusers_version': '0.18.0.dev0', 'clip_sample': False, 'clip_sample_range': 1.0, 'dynamic_thresholding_ratio': 0.995, 'sample_max_value': 1.0, 'set_alpha_to_one': False, 'skip_prk_steps': True, 'thresholding': False})

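Module: image_encoder| Class: NoneType| Device: None| DType: None| Params: 0| Modules: 0| Config: None

Module: feature_extractor| Class: NoneType| Device: None| DType: None| Params: 0| Modules: 0| Config: None

Module: force_zeros_for_empty_prompt| Class: bool| Device: None| DType: None| Params: 0| Modules: 0| Config: None

Module: _name_or_path| Class: str| Device: None| DType: None| Params: 0| Modules: 0| Config: None

Module: _class_name| Class: str| Device: None| DType: None| Params: 0| Modules: 0| Config: None

Module: _diffusers_version| Class: str| Device: None| DType: None| Params: 0| Modules: 0| Config: None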
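The parameter counts above give a rough idea of why the runs need more than 40 GB of RAM. This is a back-of-the-envelope sketch that assumes 4 bytes per parameter (torch.float32, as reported for vae, text_encoder and unet) and counts weights only; activations, the Python process and everything else come on top.

```python
# Parameter counts as reported in the Params column above.
params = {
    "vae": 83_653_863,
    "text_encoder": 6_243_584_000,
    "unet": 2_579_458_820,
}
BYTES_PER_PARAM = 4  # torch.float32

total_params = sum(params.values())
total_bytes = total_params * BYTES_PER_PARAM

for name, count in params.items():
    print(f"{name:<13} {count * BYTES_PER_PARAM / 1e9:6.2f} GB")
print(f"{'total':<13} {total_bytes / 1e9:6.2f} GB")
```

The text encoder alone accounts for about 25 GB and the weights together for about 35.6 GB, which is in line with the 41 to 45 GB of RAM reported for most runs.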