Intro
App: https://github.com/vladmandic/sdnext/tree/dev Version 2025-07-04 (ipex)
Model: https://huggingface.co/Kwai-Kolors/Kolors
HW: Intel Core i7-1355U with Intel Iris Xe Graphics iGPU, 96 GB DDR5-5600 CL46 RAM
Part 1 - Bookshop
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 16| Size: 1024x1024| Seed: 3033194654| CFG scale: 6| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline
Execution: Time: 15m 5.29s | total 906.63 pipeline 869.25 decode 36.00 preview 1.34 | RAM 44.6 GB 47%
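The execution line can be reduced to a per-step cost, which makes runs with different step counts easier to compare. The snippet below is a minimal sketch: the helper name `parse_execution` is made up for this example, and it assumes the `total`, `pipeline`, `decode` and `preview` values are seconds.

```python
import re

def parse_execution(line: str) -> dict:
    """Extract the named timing fields (assumed to be seconds) from an Execution line."""
    fields = re.findall(r"(total|pipeline|decode|preview) (\d+(?:\.\d+)?)", line)
    return {name: float(value) for name, value in fields}

line = "Time: 15m 5.29s | total 906.63 pipeline 869.25 decode 36.00 preview 1.34 | RAM 44.6 GB 47%"
timings = parse_execution(line)
steps = 16  # taken from the Parameters line of this run

seconds_per_step = timings["pipeline"] / steps
decode_share = timings["decode"] / timings["total"]
print(f"{seconds_per_step:.1f} s/step, decode is {decode_share:.1%} of the total")
```

For this run that works out to roughly 54 seconds per sampling step, with the VAE decode adding about 36 seconds on top.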
| | STEPS: 4 | STEPS: 8 | STEPS: 16 | STEPS: 20 | STEPS: 32 |
|---|---|---|---|---|---|
| CFG0 | | | | | |
| CFG1 | | | | | |
| CFG2 | | | | | |
| CFG3 | | | | | |
| CFG4 | | | | | |
| CFG5 | | | | | |
| CFG6 | | | | | |
| CFG7 | | | | | |
| CFG8 | | | | | |
| CFG9 | | | | | |
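A grid like this one could also be scripted outside SD.Next. The following is a rough sketch using the `KolorsPipeline` class from diffusers, not the way these images were actually produced: the function names and output file names are invented for illustration, `run_grid` has not been run on this hardware, and seed handling in SD.Next may differ from a plain `torch.Generator`, so the images would not necessarily match.

```python
from itertools import product

STEPS = [4, 8, 16, 20, 32]
CFG_SCALES = list(range(10))  # CFG0 .. CFG9
SEED = 3033194654
PROMPT = "photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling"

def grid_kwargs():
    """Yield one dict of pipeline call arguments per grid cell, row by row."""
    for cfg, steps in product(CFG_SCALES, STEPS):
        yield {
            "prompt": PROMPT,
            "num_inference_steps": steps,
            "guidance_scale": float(cfg),
            "height": 1024,
            "width": 1024,
        }

def run_grid(model_id="Kwai-Kolors/Kolors-diffusers"):
    """Render every cell. Heavy imports are deferred so the grid can be inspected without them."""
    import torch
    from diffusers import KolorsPipeline

    pipe = KolorsPipeline.from_pretrained(model_id, torch_dtype=torch.float32)
    for kwargs in grid_kwargs():
        generator = torch.Generator("cpu").manual_seed(SEED)  # same seed for every cell
        image = pipe(generator=generator, **kwargs).images[0]
        cfg, steps = int(kwargs["guidance_scale"]), kwargs["num_inference_steps"]
        image.save(f"bookshop_cfg{cfg}_steps{steps}.png")  # hypothetical file name

cells = list(grid_kwargs())
total_steps = sum(cell["num_inference_steps"] for cell in cells)
print(len(cells), "images,", total_steps, "sampling steps in total")
```

Counting the sampling steps before starting is useful on slow hardware, because the full grid adds up to many more steps than a single run.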
Part 2 - Face and hand
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 20| Size: 1024x1024| Seed: 3317287141| CFG scale: 6| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline
processing | 12.1/60.8s
Execution: Time: 18m 39.49s | total 1120.96 pipeline 1083.47 decode 35.96 preview 1.47 | RAM 61.37 GB 65%
| | STEPS: 12 | STEPS: 16 | STEPS: 20 | STEPS: 24 |
|---|---|---|---|---|
| CFG=1 | | | | |
| CFG=2 | | | | |
| CFG=3 | | | | |
| CFG=5 | | | | |
| CFG=8 | | | | |
Part 3 - Legs and ribbon
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
Parameters: Steps: 20| Size: 1024x1024| Seed: 3033194654| CFG scale: 8| Model: Kolors-diffusers| App: SD.Next| Version: 1a3b6e3| Operations: txt2img| Pipeline: KolorsPipeline
| | STEPS: 12 | STEPS: 16 | STEPS: 20 | STEPS: 24 |
|---|---|---|---|---|
| CFG=1 | | | | |
| CFG=2 | | | | |
| CFG=3 | | | | |
| CFG=5 | | | | |
| CFG=8 | | | | |
Is there a way to draw the legs correctly?
- trying several random seeds, which may give a better result
- trying different resolutions
- trying a different sampler
- trying negative prompts
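For the sampler item, a diffusers pipeline can be switched to another scheduler by rebuilding it from the current scheduler's config. The sketch below is only an illustration: the mapping from the sampler names compared later on this page to diffusers scheduler classes is an assumption (SD.Next may configure them differently), and `set_sampler` is a made-up helper name.

```python
# Assumed mapping: sampler name -> (diffusers scheduler class name, config overrides).
SAMPLERS = {
    "Euler": ("EulerDiscreteScheduler", {}),
    "Heun": ("HeunDiscreteScheduler", {}),
    "LCM": ("LCMScheduler", {}),
    "DDIM": ("DDIMScheduler", {}),
    "DPM++ 2M": ("DPMSolverMultistepScheduler", {}),
    "DPM++ 1S": ("DPMSolverSinglestepScheduler", {}),
    "DPM++ 2M SDE": ("DPMSolverMultistepScheduler", {"algorithm_type": "sde-dpmsolver++"}),
    "KDPM2": ("KDPM2DiscreteScheduler", {}),
}

def set_sampler(pipe, name):
    """Replace pipe.scheduler with the class mapped to `name`, reusing the existing config."""
    import diffusers  # deferred so the mapping can be inspected without diffusers installed

    class_name, overrides = SAMPLERS[name]
    scheduler_cls = getattr(diffusers, class_name)
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config, **overrides)
    return pipe

print(sorted(SAMPLERS))
```

A call such as `set_sampler(pipe, "DPM++ 2M")` would then be made once before generating, with `pipe` being an already loaded pipeline.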
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
Trying different resolutions and samplers
Seed 101684518 was found to give a better result than 3033194654, simply by running several generations with random seeds.
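A seed search of this kind is easy to automate. The sketch below is a hedged illustration rather than what was used here: `seed_sweep` is a made-up helper, `generate` stands for whatever call actually renders an image, and the 32-bit seed range is an assumption.

```python
import random

def seed_sweep(generate, count=8, steps=12, rng=None):
    """Call generate(seed=..., steps=...) for `count` random seeds and return the seeds used."""
    rng = rng or random.Random()
    seeds = [rng.randrange(2**32) for _ in range(count)]
    for seed in seeds:
        generate(seed=seed, steps=steps)
    return seeds

# Stand-in for a real generation call: it only records what it was asked to render.
requested = []
seeds = seed_sweep(lambda seed, steps: requested.append((seed, steps)), count=4)
print(seeds)
```

Keeping the step count low during the sweep keeps each attempt short; a promising seed can then be rerun with more steps.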
| | Steps: 12, Size: 1024x1024, Seed: 101684518, CFG scale: 3 | Steps: 12, Size: 768x768, Seed: 101684518, CFG scale: 3 | Steps: 12, Size: 512x512, Seed: 101684518, CFG scale: 3 |
|---|---|---|---|
| Default (Euler) | Time: 11m 22.99s | Time: 6m 38.81s | |
| Heun | Time: 5m 53.04s | | |
| LCM | Time: 3m 17.69s | | |
| DDIM | Time: 3m 17.08s | | |
| DPM++ 2M | Time: 3m 13.02s | | |
| DPM++ 1S | Time: 3m 15.26s | | |
| DPM++ 2M SDE | Time: 3m 18.49s | | |
| KDPM2 | Time: 5m 55.50s | | |
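To compare the samplers numerically, the reported times can be converted to seconds and divided by the 12 steps. This is a small sketch that takes the first time listed in each row of the table; the helper name `to_seconds` is made up, and the result is wall-clock time per step, not a measurement of the sampler alone.

```python
import re

def to_seconds(text: str) -> float:
    """Convert a duration such as '11m 22.99s' or '45.5s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m )?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)

STEPS = 12
first_time_per_row = {
    "Default (Euler)": "11m 22.99s",
    "Heun": "5m 53.04s",
    "LCM": "3m 17.69s",
    "DDIM": "3m 17.08s",
    "DPM++ 2M": "3m 13.02s",
    "DPM++ 1S": "3m 15.26s",
    "DPM++ 2M SDE": "3m 18.49s",
    "KDPM2": "5m 55.50s",
}

seconds = {name: to_seconds(t) for name, t in first_time_per_row.items()}
fastest = min(seconds, key=seconds.get)
for name, value in sorted(seconds.items(), key=lambda item: item[1]):
    ratio = value / seconds[fastest]
    print(f"{name:16s} {value:7.2f} s  {value / STEPS:5.1f} s/step  x{ratio:.2f}")
```

Heun and KDPM2 come out at roughly 1.8 times the DPM++ 2M time, which would be consistent with second-order samplers evaluating the UNet twice per step.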
System Info
- app: sdnext updated: 2025-07-04 hash: 1a3b6e3b url: https://github.com/vladmandic/sdnext/tree/dev
- arch: x86_64 cpu: x86_64 system: Linux release: 6.11.0-28-generic python: 3.12.3
- Torch 2.7.1+xpu device: Intel(R) Iris(R) Xe Graphics (iGPU) openvino: 2025.2.0
- ram: free:31.06 used:62.91 total:93.97
- gpu: total:93.97
- xformers: diffusers: 0.35.0.dev0 transformers: 4.53.0
- active: cpu dtype: torch.float32 vae: torch.float32 unet: torch.float32
- base: Diffusers/Kwai-Kolors/Kolors-diffusers [7e091c7519] refiner: none vae: none te: none unet: none
Model Data
Model: Diffusers/Kwai-Kolors/Kolors-diffusers Type: KolorsPipeline Class: KolorsPipeline Size: 0 bytes Modified: 2025-06-26 22:08:35
SD.Next dev 2025-06-29
Module | Class | Device | DType | Params | Modules | Config |
|---|---|---|---|---|---|---|
vae | AutoencoderKL | cpu | torch.float32 | 83653863 | 243 | FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 4, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.13025, 'shift_factor': None, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': True, 'use_post_quant_conv': True, 'mid_block_add_attention': True, '_use_default_values': ['mid_block_add_attention', 'latents_mean', 'use_quant_conv', 'use_post_quant_conv', 'latents_std', 'shift_factor', 'force_upcast'], '_class_name': 'AutoencoderKL', '_diffusers_version': '0.18.0.dev0', '_name_or_path': 'models/Diffusers/models--Kwai-Kolors--Kolors-diffusers/snapshots/7e091c75199e910a26cd1b51ed52c28de5db3711/vae'}) |
text_encoder | ChatGLMModel | cpu | torch.float32 | 6243584000 | 316 | ChatGLMConfig { "add_bias_linear": false, "add_qkv_bias": true, "apply_query_key_layer_scaling": true, "apply_residual_connection_post_layernorm": false, "architectures": [ "ChatGLMModel" ], "attention_dropout": 0.0, "attention_softmax_in_fp32": true, "auto_map": { "AutoConfig": "configuration_chatglm.ChatGLMConfig", "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification" }, "bias_dropout_fusion": true, "classifier_dropout": null, "eos_token_id": 2, "ffn_hidden_size": 13696, "fp32_residual_connection": false, "hidden_dropout": 0.0, "hidden_size": 4096, "kv_channels": 128, "layernorm_epsilon": 1e-05, "model_type": "chatglm", "multi_query_attention": true, "multi_query_group_num": 2, "num_attention_heads": 32, "num_layers": 28, "original_rope": true, "pad_token_id": 0, "padded_vocab_size": 65024, "post_layer_norm": true, "pre_seq_len": null, "prefix_projection": false, "quantization_bit": 0, "rmsnorm": true, "seq_length": 32768, "tie_word_embeddings": false, "torch_dtype": "float32", "transformers_version": "4.53.0", "use_cache": true, "vocab_size": 65024 } |
tokenizer | ChatGLMTokenizer | None | None | 0 | 0 | None |
unet | UNet2DConditionModel | cpu | torch.float32 | 2579458820 | 1931 | FrozenDict({'sample_size': 128, 'in_channels': 4, 'out_channels': 4, 'center_input_sample': False, 'flip_sin_to_cos': True, 'freq_shift': 0, 'down_block_types': ['DownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D'], 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'up_block_types': ['CrossAttnUpBlock2D', 'CrossAttnUpBlock2D', 'UpBlock2D'], 'only_cross_attention': False, 'block_out_channels': [320, 640, 1280], 'layers_per_block': 2, 'downsample_padding': 1, 'mid_block_scale_factor': 1, 'dropout': 0.0, 'act_fn': 'silu', 'norm_num_groups': 32, 'norm_eps': 1e-05, 'cross_attention_dim': 2048, 'transformer_layers_per_block': [1, 2, 10], 'reverse_transformer_layers_per_block': None, 'encoder_hid_dim': 4096, 'encoder_hid_dim_type': 'text_proj', 'attention_head_dim': [5, 10, 20], 'num_attention_heads': None, 'dual_cross_attention': False, 'use_linear_projection': True, 'class_embed_type': None, 'addition_embed_type': 'text_time', 'addition_time_embed_dim': 256, 'num_class_embeds': None, 'upcast_attention': False, 'resnet_time_scale_shift': 'default', 'resnet_skip_time_act': False, 'resnet_out_scale_factor': 1.0, 'time_embedding_type': 'positional', 'time_embedding_dim': None, 'time_embedding_act_fn': None, 'timestep_post_act': None, 'time_cond_proj_dim': None, 'conv_in_kernel': 3, 'conv_out_kernel': 3, 'projection_class_embeddings_input_dim': 5632, 'attention_type': 'default', 'class_embeddings_concat': False, 'mid_block_only_cross_attention': None, 'cross_attention_norm': None, 'addition_embed_type_num_heads': 64, '_class_name': 'UNet2DConditionModel', '_diffusers_version': '0.27.0.dev0', '_name_or_path': 'models/Diffusers/models--Kwai-Kolors--Kolors-diffusers/snapshots/7e091c75199e910a26cd1b51ed52c28de5db3711/unet'}) |
scheduler | EulerDiscreteScheduler | None | None | 0 | 0 | FrozenDict({'num_train_timesteps': 1100, 'beta_start': 0.00085, 'beta_end': 0.014, 'beta_schedule': 'scaled_linear', 'trained_betas': None, 'prediction_type': 'epsilon', 'interpolation_type': 'linear', 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'sigma_min': None, 'sigma_max': None, 'timestep_spacing': 'leading', 'timestep_type': 'discrete', 'steps_offset': 1, 'rescale_betas_zero_snr': False, 'final_sigmas_type': 'zero', '_use_default_values': ['use_exponential_sigmas', 'final_sigmas_type', 'timestep_type', 'sigma_min', 'sigma_max', 'use_beta_sigmas'], '_class_name': 'EulerDiscreteScheduler', '_diffusers_version': '0.18.0.dev0', 'clip_sample': False, 'clip_sample_range': 1.0, 'dynamic_thresholding_ratio': 0.995, 'sample_max_value': 1.0, 'set_alpha_to_one': False, 'skip_prk_steps': True, 'thresholding': False}) |
image_encoder | NoneType | None | None | 0 | 0 | None |
feature_extractor | NoneType | None | None | 0 | 0 | None |
force_zeros_for_empty_prompt | bool | None | None | 0 | 0 | None |
_name_or_path | str | None | None | 0 | 0 | None |
_class_name | str | None | None | 0 | 0 | None |
_diffusers_version | str | None | None | 0 | 0 | None |
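The parameter counts in the table give a quick lower bound on memory use. The sketch below simply multiplies each count by 4 bytes, since every module is listed as torch.float32; it covers the weights only and says nothing about activations or other processes.

```python
# Parameter counts copied from the Params column of the table above.
PARAMS = {
    "vae": 83653863,
    "text_encoder": 6243584000,
    "unet": 2579458820,
}
BYTES_PER_PARAM = 4  # torch.float32

total_params = sum(PARAMS.values())
for name, count in PARAMS.items():
    print(f"{name}: {count * BYTES_PER_PARAM / 1e9:.2f} GB")
total_gb = total_params * BYTES_PER_PARAM / 1e9
print(f"total: {total_gb:.2f} GB")
```

The weights alone come to about 35.6 GB, most of it in the ChatGLM text encoder, which goes some way toward explaining the RAM figures reported in the execution lines.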










