https://huggingface.co/OPPOer/Qwen-Image-Pruning
https://huggingface.co/OPPOer/Qwen-Image-Pruning/tree/main/Qwen-Image-13B-8steps

Base model test is Test 40 - Qwen Image. The fragment below is completed into a runnable form (the pipeline-load lines and the explicit prompt are filled in; the prompt used here is the bookshop prompt from this test):

```python
import torch
from diffusers import DiffusionPipeline

model_name = "OPPOer/Qwen-Image-Pruning"
pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch.bfloat16)
pipe.to("cuda")

positive_magic = {
    "en": ", Ultra HD, 4K, cinematic composition.",  # for English prompts
    "zh": ",超清,4K,电影级构图。",  # for Chinese prompts
}

prompt = "photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling"
image = pipe(
    prompt=prompt + positive_magic["en"],
    width=1328,
    height=1328,
    num_inference_steps=8,
    true_cfg_scale=1,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
| STEP 8, AG 1 | Seed: 1620085323 | Seed: 1931701040 | Seed: 4075624134 | Seed: 2736029172 |
|---|---|---|---|---|
| bookshop girl | | | | |
| hand and face | | | | |
| legs and shoes | | | | |
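The AG-by-steps grids below look like the product of a parameter sweep over seeds, step counts, and guidance values. A minimal sketch of building such a run list (the sweep loop itself is not in the source, and mapping the AG column onto `true_cfg_scale` is an assumption based on the "CFG true" parameter in the logs):

```python
from itertools import product

# Values taken from the grids in this report (first AG-by-steps table).
seeds = [1620085323, 1931701040, 4075624134, 2736029172]
steps_list = [6, 8, 12, 16, 24]
ag_list = [0, 1, 1.1, 1.5, 2, 2.5, 4]  # assumed to map to true_cfg_scale


def sweep_runs(seeds, steps_list, ag_list):
    """Enumerate one pipeline call per (seed, steps, AG) cell of the grid."""
    return [
        {"seed": seed, "num_inference_steps": steps, "true_cfg_scale": ag}
        for seed, steps, ag in product(seeds, steps_list, ag_list)
    ]


runs = sweep_runs(seeds, steps_list, ag_list)
print(len(runs))  # 4 seeds x 5 step counts x 7 AG values = 140
```

Each entry can then be passed as keyword arguments to the pipeline call shown earlier, with the seed fed through `torch.Generator(...).manual_seed(...)`.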
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 8| Size: 1024x1024| Seed: 2736029172| CFG scale: 6| CFG true: 1| App: SD.Next| Version: f71db69| Pipeline: QwenImagePipeline| Operations: txt2img| Model: Qwen-Image-Pruning+Qwen-Image-13B-8steps
Time: 2m 34.69s | total 204.33 pipeline 154.66 vae 13.70 onload 12.04 offload 11.29 preview 5.63 te 5.50 callback 1.47 | GPU 45002 MB 37% | RAM 60.66 GB 48%
| AG \ Steps | 6 | 8 | 12 | 16 | 24 |
|---|---|---|---|---|---|
| AG0 | | | | | |
| AG1 | | | | | |
| AG1.1 | | | | | |
| AG1.5 | | | | | |
| AG2 | | | | | |
| AG2.5 | | | | | |
| AG4 | | | | | |
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 8| Size: 1024x1024| Seed: 2736029172| CFG scale: 6| CFG true: 1| App: SD.Next| Version: f71db69| Pipeline: QwenImagePipeline| Operations: txt2img| Model: Qwen-Image-Pruning+Qwen-Image-13B-8steps
Time: 2m 44.59s | total 226.95 pipeline 164.54 offload 22.21 te 14.32 vae 12.85 onload 7.87 preview 3.60 callback 1.50 | GPU 44860 MB 37% | RAM 59.22 GB 47%
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 50| Size: 1024x1024| Seed: 2736029172| CFG scale: 6| CFG true: 1| App: SD.Next| Version: f71db69| Pipeline: QwenImagePipeline| Operations: txt2img| Model: Qwen-Image-Pruning+Qwen-Image-13B-8steps
Time: 13m 49.08s | total 921.63 pipeline 829.05 preview 42.28 vae 12.87 onload 11.96 offload 10.43 callback 9.31 te 5.70 | GPU 45136 MB 38% | RAM 60.71 GB 48%
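The two timings above (same prompt and seed at 8 and 50 steps) allow a quick per-step cost check from the reported `pipeline` times:

```python
# Pipeline seconds for the hand-and-face prompt at two step counts (from the logs above).
timings = {8: 164.54, 50: 829.05}

per_step = {steps: secs / steps for steps, secs in timings.items()}
for steps, cost in sorted(per_step.items()):
    print(f"{steps:>2} steps: {cost:.2f} s/step")
```

This comes out to roughly 20.6 s/step at 8 steps and 16.6 s/step at 50 steps on this Intel Arc setup, i.e. pipeline time grows close to linearly with step count, with fixed per-run overhead amortized at higher counts.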
| AG \ Steps | 6 | 8 | 16 | 24 | 50 |
|---|---|---|---|---|---|
| AG0 | | | | | |
| AG1 | | | | | |
| AG2 | | | | | |
| AG4 | | | | | |
| AG6 | | | | | |
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
Parameters: Steps: 6| Size: 1024x1024| Seed: 2736029172| CFG scale: 6| CFG true: 1| App: SD.Next| Version: f71db69| Pipeline: QwenImagePipeline| Operations: txt2img| Model: Qwen-Image-Pruning+Qwen-Image-13B-8steps
Time: 2m 4.67s | total 170.58 pipeline 124.64 vae 12.79 onload 12.11 offload 10.31 te 5.66 preview 3.93 callback 1.11 | GPU 45134 MB 38% | RAM 57.0 GB 45%
| AG \ Steps | 4 | 8 | 16 | 32 |
|---|---|---|---|---|
| AG0 | | | | |
| AG1 | | | | |
| AG2 | | | | |
| AG4 | | | | |
| AG8 | | | | |
Prompts from: https://civitai.com/user/liutyi/collections
Tue Aug 5 20:35:18 2025 app: sdnext.git updated: 2025-10-08 hash: 71615ac4a url: https://github.com/liutyi/sdnext.git/tree/ipex arch: x86_64 cpu: x86_64 system: Linux release: 6.14.0-33-generic python: 3.12.3 Torch: 2.7.1+xpu device: Intel(R) Arc(TM) Graphics (1) ipex: 2.7.10+xpu ram: free:111.45 used:13.88 total:125.33 gpu: free:69.84 used:47.53 total:117.37 gpu-active: current:41.38 peak:43.15 gpu-allocated: current:41.38 peak:43.15 gpu-reserved: current:47.53 peak:47.53 gpu-inactive: current:0.3 peak:1.84 events: retries:0 oom:0 utilization: 0 xformers: diffusers: 0.36.0.dev0 transformers: 4.56.2 active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16 base: OPPOer/Qwen-Image-Pruning+Qwen-Image-13B refiner: none vae: none te: none unet: none Backend: ipex Pipeline: native Memory optimization: none Cross-attention: Scaled-Dot-Product
{
  "diffusers_version": "2b7deffe361b7b0e1d2665a1f9f0bd4daea4927c",
  "sd_model_checkpoint": "OPPOer/Qwen-Image-Pruning+Qwen-Image-13B",
  "sd_checkpoint_hash": null,
  "disabled_extensions": [
    "sd-webui-agent-scheduler"
  ],
  "diffusers_to_gpu": true,
  "device_map": "gpu",
  "diffusers_offload_mode": "none",
  "interrogate_blip_model": "blip2-opt-6.7b",
  "huggingface_token": "hf_xxx",
  "queue_paused": true
}
| Module | Class | Device | Dtype | Quant | Params | Modules | Config |
|---|---|---|---|---|---|---|---|
| vae | AutoencoderKLQwenImage | xpu:0 | torch.bfloat16 | None | 126892531 | 260 | FrozenDict({'base_dim': 96, 'z_dim': 16, 'dim_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_scales': [], 'temperal_downsample': [False, True, True], 'dropout': 0.0, 'latents_mean': [-0.7571, -0.7089, -0.9113, 0.1075, -0.1745, 0.9653, -0.1517, 1.5508, 0.4134, -0.0715, 0.5517, -0.3632, -0.1922, -0.9497, 0.2503, -0.2921], 'latents_std': [2.8184, 1.4541, 2.3275, 2.6558, 1.2196, 1.7708, 2.6052, 2.0743, 3.2687, 2.1526, 2.8652, 1.5579, 1.6382, 1.1253, 2.8251, 1.916], '_class_name': 'AutoencoderKLQwenImage', '_diffusers_version': '0.34.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--Qwen--Qwen-Image/snapshots/75e0b4be04f60ec59a75f475837eced720f823b6/vae'}) |
| text_encoder | Qwen2_5_VLForConditionalGeneration | xpu:0 | torch.bfloat16 | None | 8292166656 | 763 | Qwen2_5_VLConfig { "architectures": [ "Qwen2_5_VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 128000, "max_window_layers": 28, "model_type": "qwen2_5_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "text_config": { "architectures": [ "Qwen2_5_VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": null, "initializer_range": 0.02, "intermediate_size": 18944, "layer_types": [ "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention" ], "max_position_embeddings": 128000, "max_window_layers": 28, "model_type": "qwen2_5_vl_text", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": null, "use_cache": true, "use_sliding_window": false, "video_token_id": null, "vision_end_token_id": 151653, 
"vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }, "tie_word_embeddings": false, "transformers_version": "4.56.2", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "depth": 32, "dtype": "bfloat16", "fullatt_block_indexes": [ 7, 15, 23, 31 ], "hidden_act": "silu", "hidden_size": 1280, "in_channels": 3, "in_chans": 3, "initializer_range": 0.02, "intermediate_size": 3420, "model_type": "qwen2_5_vl", "num_heads": 16, "out_hidden_size": 3584, "patch_size": 14, "spatial_merge_size": 2, "spatial_patch_size": 14, "temporal_patch_size": 2, "tokens_per_second": 2, "window_size": 112 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 } |
| tokenizer | Qwen2Tokenizer | None | None | None | 0 | 0 | None |
| transformer | QwenImageTransformer2DModel | xpu:0 | torch.bfloat16 | None | 13633775168 | 1537 | FrozenDict({'patch_size': 2, 'in_channels': 64, 'out_channels': 16, 'num_layers': 40, 'attention_head_dim': 128, 'num_attention_heads': 24, 'joint_attention_dim': 3584, 'guidance_embeds': False, 'axes_dims_rope': [16, 56, 56], '_class_name': 'QwenImageTransformer2DModel', '_diffusers_version': '0.36.0.dev0', 'pooled_projection_dim': 768, '_name_or_path': 'OPPOer/Qwen-Image-Pruning'}) |
| scheduler | FlowMatchEulerDiscreteScheduler | None | None | None | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'shift': 1.0, 'use_dynamic_shifting': True, 'base_shift': 0.5, 'max_shift': 0.9, 'base_image_seq_len': 256, 'max_image_seq_len': 8192, 'invert_sigmas': False, 'shift_terminal': 0.02, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'stochastic_sampling': False, '_class_name': 'FlowMatchEulerDiscreteScheduler', '_diffusers_version': '0.34.0.dev0'}) |
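The module table gives exact parameter counts; converting them to a bfloat16 memory footprint (2 bytes per parameter) roughly accounts for the ~43 GB peak GPU allocation in the environment dump above:

```python
# Parameter counts copied from the module table; bfloat16 = 2 bytes per parameter.
modules = {
    "vae": 126_892_531,
    "text_encoder": 8_292_166_656,
    "transformer": 13_633_775_168,
}

BYTES_PER_PARAM = 2  # bfloat16
total_gib = sum(modules.values()) * BYTES_PER_PARAM / 2**30
print(f"weights: {total_gib:.1f} GiB")  # weights: 41.1 GiB
```

The weights alone come to about 41.1 GiB; the remaining few GiB of the reported 43.15 peak would be activations and workspace buffers.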