Info
| high noice | low noise | combined | |
|---|---|---|---|
WAN 2.2. A14B 512px | Prompt: A woman with long flowing hair and a sleek black dress silhouette dances gracefully in front of a large floor-to-ceiling window, her movements fluid and expressive. The window reveals a breathtaking panoramic view of a sprawling winter city at night, snow-covered rooftops gleam beneath dim streetlights, neon signs flicker across icy facades, and distant bridges reflect shimmering lights on frozen rivers. Frost patterns adorn the glass, adding depth and realism to the scene. Above the dancer, image wide bold red text reads "WAN 2.2"; smaller black text in a cyan circle "14B" appears in the upper-left corner, subtly indicating version information. The overall mood is serene yet dynamic, blending elegance with urban coldness, illuminated by warm contrasts between the dancer’s motion and the stark beauty of the nighttime landscape. Parameters: Steps: 20| Size: 512x512| Seed: 324| CFG scale: 6| App: SD.Next| Version: 6ea881b| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers Time: 2m 23.23s | total 191.41 pipeline 143.20 preview 45.54 te 1.49 vae 1.15 | GPU 39818 MB 32% | RAM 42.71 GB 35% | Prompt: A woman with long flowing hair and a sleek black dress silhouette dances gracefully in front of a large floor-to-ceiling window, her movements fluid and expressive. The window reveals a breathtaking panoramic view of a sprawling winter city at night, snow-covered rooftops gleam beneath dim streetlights, neon signs flicker across icy facades, and distant bridges reflect shimmering lights on frozen rivers. Frost patterns adorn the glass, adding depth and realism to the scene. Above the dancer, image wide bold red text reads "WAN 2.2"; smaller black text in a cyan circle "14B" appears in the upper-left corner, subtly indicating version information. The overall mood is serene yet dynamic, blending elegance with urban coldness, illuminated by warm contrasts between the dancer’s motion and the stark beauty of the nighttime landscape. Parameters: Steps: 20| Size: 512x512| Seed: 324| CFG scale: 6| App: SD.Next| Version: 6ea881b| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers Time: 2m 32.26s | total 207.77 pipeline 152.19 preview 51.21 te 2.95 vae 1.36 | GPU 39880 MB 32% | RAM 43.8 GB 36% | Prompt: A woman with long flowing hair and a sleek black dress silhouette dances gracefully in front of a large floor-to-ceiling window, her movements fluid and expressive. The window reveals a breathtaking panoramic view of a sprawling winter city at night, snow-covered rooftops gleam beneath dim streetlights, neon signs flicker across icy facades, and distant bridges reflect shimmering lights on frozen rivers. Frost patterns adorn the glass, adding depth and realism to the scene. Above the dancer, image wide bold red text reads "WAN 2.2"; smaller black text in a cyan circle "14B" appears in the upper-left corner, subtly indicating version information. The overall mood is serene yet dynamic, blending elegance with urban coldness, illuminated by warm contrasts between the dancer’s motion and the stark beauty of the nighttime landscape. Parameters: Steps: 20| Size: 512x512| Seed: 324| CFG scale: 6| App: SD.Next| Version: 6ea881b| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers |
https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B
https://github.com/Wan-Video/Wan2.2
Test 0 - Different seed variations
Low Noise only
| CFG6, STEP 20 | Seed: 1620085323 | Seed:1931701040 | Seed:4075624134 | Seed:2736029172 |
|---|---|---|---|---|
hand and face and bookshop girl |
Combined (Low and high Noise)
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 20| Size: 1024x1024| Seed: 1620085323| CFG scale: 6| App: SD.Next| Version: c9fdd56| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers
| CFG6, STEP 20 | Seed: 1620085323 | Seed:1931701040 | Seed:4075624134 | Seed:2736029172 |
|---|---|---|---|---|
| bookshop girl | ||||
| hand and face | ||||
| legs and shoes |
Test 1 - Bookshop
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 28| Size: 1024x1024| Seed: 4075624134| CFG scale: 1.0| App: SD.Next| Version: c021928| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers
Time: 19m 51.71s | total 1290.47 pipeline 1191.68 preview 98.20 vae 0.32 | GPU 70822 MB 55% | RAM 97.11 GB 77%
| 4 | 8 | 16 | 24 | 28 | |
|---|---|---|---|---|---|
CFG1 | |||||
CFG2 | |||||
CFG3 | |||||
CFG4 | |||||
CFG5 | |||||
CFG6 | |||||
CFG8 |
Test 2 - Face and hand
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 64| Size: 1024x1024| Seed: 1620085323| CFG scale: 5| App: SD.Next| Version: c021928| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers
Time: 89m 47.08s | total 5702.88 pipeline 5387.04 preview 309.48 te 4.07 vae 2.15 | GPU 70822 MB 55% | RAM 94.74 GB 76%
| 4 | 8 | 16 | 32 | 64 | |
|---|---|---|---|---|---|
CFG1 | |||||
CFG2 | |||||
CFG3 | |||||
CFG4 | |||||
CFG5 | |||||
CFG6 | |||||
CFG8 | |||||
CFG10 |
Test 3 - Legs
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
Parameters: Steps: 20| Size: 1024x1024| Seed: 2736029172| CFG scale: 6| App: SD.Next| Version: c021928| Pipeline: WanPipeline| Operations: txt2img| Model: Wan2.2-T2V-A14B-Diffusers
Time: 28m 2.91s | total 1729.31 pipeline 1682.88 preview 43.76 te 2.32 vae 0.32 | GPU 70822 MB 55% | RAM 97.11 GB 77%
| 8 | 16 | 20 | 32 | |
|---|---|---|---|---|
CFG1 | ||||
CFG2 | ||||
CFG3 | ||||
CFG4 | ||||
CFG5 | ||||
CFG6 | ||||
CFG8 |
Other model covers
System info
Fri Oct 24 15:48:18 2025 app: sdnext.git updated: 2025-10-23 hash: c9fdd56eb url: https://github.com/liutyi/sdnext.git/tree/pytorch arch: x86_64 cpu: x86_64 system: Linux release: 6.14.0-33-generic python: 3.12.3 pytorch 2.9.0+xpu device: Intel(R) Arc(TM) Graphics (1) ipex: ram: free:121.55 used:3.78 total:125.33 xformers: diffusers: 0.36.0.dev0 transformers: 4.57.1 active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16 base: Diffusers/Wan-AI/Wan2.2-T2V-A14B-Diffusers [5be7df9619] refiner: none vae: none te: none unet: none ipex native Scaled-Dot-Product
and
Thu Nov 13 18:34:05 2025 app: sdnext.git updated: 2025-11-11 hash: d5eaed811 url: https://github.com/liutyi/sdnext/tree/ipex arch: x86_64 cpu: x86_64 system: Linux release: 6.14.0-35-generic python: 3.12.3 pytorch 2.7.1+xpu device: Intel(R) Arc(TM) Graphics (1) ipex: 2.7.10+xpu ram: free:96.12 used:29.21 total:125.33 gpu: free:48.11 used:69.27 total:117.37 gpu-active: current:64.53 peak:65.78 gpu-allocated: current:64.53 peak:65.78 gpu-reserved: current:69.27 peak:69.27 gpu-inactive: current:0.02 peak:0.64 events: retries:0 oom:0 utilization: 0 xformers: diffusers: 0.36.0.dev0 transformers: 4.57.1 active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16 base: Wan-AI/Wan2.2-T2V-A14B-Diffusers refiner: none vae: none te: none unet: none ipex native Scaled-Dot-Product
Config
{
"sd_model_checkpoint": "Wan-AI/Wan2.2-T2V-A14B-Diffusers",
"diffusers_to_gpu": true,
"device_map": "gpu",
"diffusers_offload_mode": "none",
"sdnq_dequantize_compile": false,
"ui_request_timeout": 300000,
"huggingface_token": "hf..FraU",
"diffusers_version": "b3e9dfced7c9e8d00f646c710766b532383f04c6",
"sd_checkpoint_hash": "6681e8e4b134c81f16533acedb0d406d7e5e366e1624b4105178c64d00b05d51",
"civitai_token": "f1..65",
"model_wan_stage": "combined"
}
Model info
| Module | Class | Device | Dtype | Quant | Params | Modules | Config |
|---|---|---|---|---|---|---|---|
| vae | AutoencoderKLWan | cpu | torch.bfloat16 | None | 126892531 | 260 | FrozenDict({'base_dim': 96, 'decoder_base_dim': None, 'z_dim': 16, 'dim_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_scales': [], 'temperal_downsample': [False, True, True], 'dropout': 0.0, 'latents_mean': [-0.7571, -0.7089, -0.9113, 0.1075, -0.1745, 0.9653, -0.1517, 1.5508, 0.4134, -0.0715, 0.5517, -0.3632, -0.1922, -0.9497, 0.2503, -0.2921], 'latents_std': [2.8184, 1.4541, 2.3275, 2.6558, 1.2196, 1.7708, 2.6052, 2.0743, 3.2687, 2.1526, 2.8652, 1.5579, 1.6382, 1.1253, 2.8251, 1.916], 'is_residual': False, 'in_channels': 3, 'out_channels': 3, 'patch_size': None, 'scale_factor_temporal': 4, 'scale_factor_spatial': 8, '_use_default_values': ['scale_factor_spatial', 'is_residual', 'scale_factor_temporal', 'in_channels', 'patch_size', 'out_channels', 'decoder_base_dim'], '_class_name': 'AutoencoderKLWan', '_diffusers_version': '0.35.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--Wan-AI--Wan2.2-T2V-A14B-Diffusers/snapshots/5be7df9619b54f4e2667b2755bc6a756675b5cd7/vae'}) |
| text_encoder | UMT5EncoderModel | xpu:0 | torch.bfloat16 | None | 5680910336 | 486 | UMT5Config { "architectures": [ "UMT5EncoderModel" ], "classifier_dropout": 0.0, "d_ff": 10240, "d_kv": 64, "d_model": 4096, "decoder_start_token_id": 0, "dense_act_fn": "gelu_new", "dropout_rate": 0.1, "dtype": "bfloat16", "eos_token_id": 1, "feed_forward_proj": "gated-gelu", "initializer_factor": 1.0, "is_encoder_decoder": true, "is_gated_act": true, "layer_norm_epsilon": 1e-06, "model_type": "umt5", "num_decoder_layers": 24, "num_heads": 64, "num_layers": 24, "output_past": true, "pad_token_id": 0, "relative_attention_max_distance": 128, "relative_attention_num_buckets": 32, "scalable_attention": true, "tie_word_embeddings": false, "tokenizer_class": "T5Tokenizer", "transformers_version": "4.57.1", "use_cache": true, "vocab_size": 256384 } |
| tokenizer | T5TokenizerFast | None | None | None | 0 | 0 | None |
| transformer | WanTransformer3DModel | xpu:0 | torch.bfloat16 | None | 14288491584 | 1138 | FrozenDict({'patch_size': [1, 2, 2], 'num_attention_heads': 40, 'attention_head_dim': 128, 'in_channels': 16, 'out_channels': 16, 'text_dim': 4096, 'freq_dim': 256, 'ffn_dim': 13824, 'num_layers': 40, 'cross_attn_norm': True, 'qk_norm': 'rms_norm_across_heads', 'eps': 1e-06, 'image_dim': None, 'added_kv_proj_dim': None, 'rope_max_seq_len': 1024, 'pos_embed_seq_len': None, '_class_name': 'WanTransformer3DModel', '_diffusers_version': '0.35.0.dev0', '_name_or_path': 'Wan-AI/Wan2.2-T2V-A14B-Diffusers'}) |
| scheduler | UniPCMultistepScheduler | None | None | None | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'beta_start': 0.0001, 'beta_end': 0.02, 'beta_schedule': 'linear', 'trained_betas': None, 'solver_order': 2, 'prediction_type': 'flow_prediction', 'thresholding': False, 'dynamic_thresholding_ratio': 0.995, 'sample_max_value': 1.0, 'predict_x0': True, 'solver_type': 'bh2', 'lower_order_final': True, 'disable_corrector': [], 'solver_p': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'use_flow_sigmas': True, 'flow_shift': 3.0, 'timestep_spacing': 'linspace', 'steps_offset': 0, 'final_sigmas_type': 'zero', 'rescale_betas_zero_snr': False, 'use_dynamic_shifting': False, 'time_shift_type': 'exponential', '_class_name': 'UniPCMultistepScheduler', '_diffusers_version': '0.35.0.dev0'}) |
| transformer_2 | WanTransformer3DModel | xpu:0 | torch.bfloat16 | None | 14288491584 | 1138 | FrozenDict({'patch_size': [1, 2, 2], 'num_attention_heads': 40, 'attention_head_dim': 128, 'in_channels': 16, 'out_channels': 16, 'text_dim': 4096, 'freq_dim': 256, 'ffn_dim': 13824, 'num_layers': 40, 'cross_attn_norm': True, 'qk_norm': 'rms_norm_across_heads', 'eps': 1e-06, 'image_dim': None, 'added_kv_proj_dim': None, 'rope_max_seq_len': 1024, 'pos_embed_seq_len': None, '_class_name': 'WanTransformer3DModel', '_diffusers_version': '0.35.0.dev0', '_name_or_path': 'Wan-AI/Wan2.2-T2V-A14B-Diffusers'}) |
| boundary_ratio | float | None | None | None | 0 | 0 | None |
| expand_timesteps | bool | None | None | None | 0 | 0 | None |


