Info
DRAFT

Info

https://github.com/vladmandic/sdnext/wiki/HiDream

...

Prompt: A Nice woman is applying red nail polish on toenails at the windowsill of a New York high-rise building with a view of the night city.

	4	8	16	28	32
0	Image Added	Image Added	Image Added	Image Added	Image Added
1	Image Added	Image Added	Image Added	Image Added	Image Added
2	Image Added	Image Added	Image Added	Image Added	Image Added
3	4		Image Added	Image Added	Image Added	Image Added	Image Added 5

Test 1 - Bookshop

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

...

	8	16	28	32
CFG0.5	Image Added	Image Added	Image Added	Image Added
CFG1	Image Added	Image Added	Image Added	Image Added
CFG1.5	Image Added	Image Added	Image Added	Image Added
CFG2	Image Added	Image Added	Image Added	Image Added
CFG2.5	Image Added	Image Added	Image Added	Image Added

Test 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Execution: Time: 13m 29.71s | pipeline 763.85 move 31.70 prompt 31.66 offload 21.53 te 19.76 decode 14.14 | GPU 38086 MB 30% | RAM 58.64 GB 47%
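The execution summary above packs wall-clock time, per-stage timings, and memory use into a single line. The sketch below shows one way to turn such a line into numbers for comparison across runs. The function name `parse_execution_line` and the output keys are hypothetical, and the stage names are simply the labels that appear in the line; what each stage covers internally is not stated on this page.

```python
import re

def parse_execution_line(line: str) -> dict:
    """Parse a 'Time: ... | pipeline ... | GPU ... | RAM ...' summary into numbers."""
    result = {}
    # Wall-clock time such as "13m 29.71s"; the minutes part is treated as optional.
    m = re.search(r"Time:\s*(?:(\d+)m\s*)?([\d.]+)s", line)
    if m:
        result["total_s"] = int(m.group(1) or 0) * 60 + float(m.group(2))
    # Stage timings appear as "<label> <seconds>" pairs.
    for name, value in re.findall(
        r"\b(pipeline|move|prompt|offload|te|decode)\s+([\d.]+)", line
    ):
        result[f"{name}_s"] = float(value)
    # Memory figures, kept in the units printed in the line.
    m = re.search(r"GPU\s+(\d+)\s*MB", line)
    if m:
        result["gpu_mb"] = int(m.group(1))
    m = re.search(r"RAM\s+([\d.]+)\s*GB", line)
    if m:
        result["ram_gb"] = float(m.group(1))
    return result

line = ("Time: 13m 29.71s | pipeline 763.85 move 31.70 prompt 31.66 "
        "offload 21.53 te 19.76 decode 14.14 | GPU 38086 MB 30% | RAM 58.64 GB 47%")
stats = parse_execution_line(line)
print(stats)
print(f"pipeline share of total: {stats['pipeline_s'] / stats['total_s']:.1%}")
```

For this run the pipeline stage accounts for most of the wall-clock time, so the other labelled stages are a comparatively small part of the total.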

	8	16	28	32
0.5	Image Added	Image Added	Image Added	Image Added
1	Image Added	Image Added	Image Added	Image Added
1.5	Image Added	Image Added	Image Added	Image Added
2	Image Added	Image Added	Image Added	Image Added
2.5	Image Added	Image Added	Image Added	Image Added

Test 3 - Legs

	8	16	24	32
CFG0.5	Image Added	Image Added	Image Added	Image Added
CFG1	Image Added	Image Added	Image Added	Image Added
CFG1.5	Image Added	Image Added	Image Added	Image Added
CFG2	Image Added	Image Added	Image Added	Image Added
CFG2.5	Image Added	Image Added	Image Added	Image Added
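Each grid above is a sweep over two settings, with CFG scale on the rows and the numeric column headers assumed here to be step counts. A minimal sketch of enumerating such a sweep in row order follows. The helper `sweep`, the dictionary keys, and the seed value `0` are hypothetical placeholders; the page does not state which seed was used for the grids or how the images were actually generated.

```python
from itertools import product

CFG_SCALES = [0.5, 1.0, 1.5, 2.0, 2.5]  # table rows
STEP_COUNTS = [8, 16, 24, 32]           # table columns (assumed to be steps)

def sweep(cfg_scales, step_counts, seed):
    """Yield one settings dict per table cell, row by row, with a fixed seed."""
    for cfg, steps in product(cfg_scales, step_counts):
        yield {"cfg_scale": cfg, "steps": steps, "seed": seed}

# seed=0 is a placeholder, not a value taken from the page.
runs = list(sweep(CFG_SCALES, STEP_COUNTS, seed=0))
print(len(runs), "images in the grid")
for settings in runs[:4]:
    print(settings)
```

Keeping the seed fixed across the whole grid means differences between cells come from the two swept settings only.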

System Info

app: sdnext.git updated: 2025-06-16 hash: 72eb0132 url: https://github.com/vladmandic/sdnext.git/tree/master
arch: x86_64 cpu: x86_64 system: Linux release: 6.11.0-28-generic python: 3.12.3
Torch 2.7.1+xpu
device: Intel(R) Arc(TM) Graphics (1) ipex: 
ram: free:123.91 used:1.42 total:125.33
xformers: diffusers: 0.34.0.dev0 transformers: 4.52.4
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: Diffusers/HiDream-ai/HiDream-I1-Dev [5b3f48f0d6] refiner: none vae: none te: none unet: none

Tech details

Model: Diffusers/HiDream-ai/HiDream-I1-Dev Type: h1 Class: HiDreamImagePipeline Size: 0 bytes Modified: 2025-06-08 21:49:25

Module	Class	Device	DType	Params	Modules	Config
vae	AutoencoderKL	cpu	torch.bfloat16	83819683	241	FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 16, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.3611, 'shift_factor': 0.1159, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': False, 'use_post_quant_conv': False, 'mid_block_add_attention': True, '_class_name': 'AutoencoderKL', '_diffusers_version': '0.30.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--HiDream-ai--HiDream-I1-Dev/snapshots/0fad2ea0ccf9a80ddf019ea777eedb27c1ccb232/vae'})
text_encoder	CLIPTextModelWithProjection	xpu:0	torch.bfloat16	123781632	153	CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "quick_gelu", "hidden_size": 768, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 248, "model_type": "clip_text_model", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "projection_dim": 768, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "vocab_size": 49408 }
text_encoder_2	CLIPTextModelWithProjection	xpu:0	torch.bfloat16	694840320	393	CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "gelu", "hidden_size": 1280, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 5120, "layer_norm_eps": 1e-05, "max_position_embeddings": 218, "model_type": "clip_text_model", "num_attention_heads": 20, "num_hidden_layers": 32, "pad_token_id": 1, "projection_dim": 1280, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "vocab_size": 49408 }
text_encoder_3	T5EncoderModel	cpu	torch.bfloat16	4762310656	463	T5Config { "architectures": [ "T5EncoderModel" ], "classifier_dropout": 0.0, "d_ff": 10240, "d_kv": 64, "d_model": 4096, "decoder_start_token_id": 0, "dense_act_fn": "gelu_new", "dropout_rate": 0.1, "eos_token_id": 1, "feed_forward_proj": "gated-gelu", "initializer_factor": 1.0, "is_encoder_decoder": true, "is_gated_act": true, "layer_norm_epsilon": 1e-06, "model_type": "t5", "num_decoder_layers": 24, "num_heads": 64, "num_layers": 24, "output_past": true, "pad_token_id": 0, "relative_attention_max_distance": 128, "relative_attention_num_buckets": 32, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "use_cache": true, "vocab_size": 32128 }
text_encoder_4	LlamaForCausalLM	cpu	torch.bfloat16	8030261248	423	LlamaConfig { "architectures": [ "LlamaForCausalLM" ], "attention_bias": false, "attention_dropout": 0.0, "bos_token_id": 128000, "eos_token_id": [ 128001, 128008, 128009 ], "head_dim": 128, "hidden_act": "silu", "hidden_size": 4096, "initializer_range": 0.02, "intermediate_size": 14336, "max_position_embeddings": 131072, "mlp_bias": false, "model_type": "llama", "num_attention_heads": 32, "num_hidden_layers": 32, "num_key_value_heads": 8, "output_attentions": true, "output_hidden_states": true, "pretraining_tp": 1, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 8.0, "high_freq_factor": 4.0, "low_freq_factor": 1.0, "original_max_position_embeddings": 8192, "rope_type": "llama3" }, "rope_theta": 500000.0, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "use_cache": true, "vocab_size": 128256 }
tokenizer	CLIPTokenizer			0	0	None
tokenizer_2	CLIPTokenizer			0	0	None
tokenizer_3	T5Tokenizer			0	0	None
tokenizer_4	PreTrainedTokenizerFast			0	0	None
scheduler	FlowMatchLCMScheduler			0	0	FrozenDict({'num_train_timesteps': 1000, 'shift': 6.0, 'use_dynamic_shifting': False, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'scale_factors': None, 'upscale_mode': 'bicubic', '_class_name': 'FlowMatchLCMScheduler', '_diffusers_version': '0.34.0.dev0'})
transformer	HiDreamImageTransformer2DModel	xpu:0	torch.bfloat16	17105733184	2090	FrozenDict({'patch_size': 2, 'in_channels': 16, 'out_channels': 16, 'num_layers': 16, 'num_single_layers': 32, 'attention_head_dim': 128, 'num_attention_heads': 20, 'caption_channels': [4096, 4096], 'text_emb_dim': 2048, 'num_routed_experts': 4, 'num_activated_experts': 2, 'axes_dims_rope': [64, 32, 32], 'max_resolution': [128, 128], 'llama_layers': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31], 'force_inference_output': False, '_use_default_values': ['force_inference_output'], '_class_name': 'HiDreamImageTransformer2DModel', '_diffusers_version': '0.32.1', '_name_or_path': 'HiDream-ai/HiDream-I1-Dev'})
_name_or_path	str			0	0	None
_class_name	str			0	0	None
_diffusers_version	str			0	0	None
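The Params and DType columns above are enough for a rough estimate of how much memory the weights alone occupy on each device. The sketch below assumes 2 bytes per parameter for torch.bfloat16 and uses the device placement exactly as listed in the table, which is only a snapshot since offloading can move modules. It counts weights only, so activations and runtime overhead are not included. The helper name `weight_mib` is hypothetical.

```python
# Parameter counts and devices copied from the module table.
PARAMS = {
    "vae": (83819683, "cpu"),
    "text_encoder": (123781632, "xpu:0"),
    "text_encoder_2": (694840320, "xpu:0"),
    "text_encoder_3": (4762310656, "cpu"),
    "text_encoder_4": (8030261248, "cpu"),
    "transformer": (17105733184, "xpu:0"),
}
BYTES_PER_PARAM = 2  # torch.bfloat16 stores each value in 16 bits

def weight_mib(device_prefix: str) -> float:
    """Sum weight sizes (in MiB) for modules whose device starts with the prefix."""
    total_params = sum(
        count for count, device in PARAMS.values()
        if device.startswith(device_prefix)
    )
    return total_params * BYTES_PER_PARAM / 2**20

for device in ("xpu", "cpu"):
    print(f"{device}: {weight_mib(device):,.0f} MiB of weights")
```

The transformer dominates the xpu total, and the weights-only figure comes out somewhat below the GPU usage reported in the execution summaries, which is the expected direction given that those summaries also include working memory.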

Speed of Dev compared with Full

prompt: car

HiDream-I1-Dev [5b3f48f0d6] FlowMatchLCMScheduler, 1024x1024		HiDream-I1-Full [fe6156b63d] UniPCMultistepScheduler, 1024x1024

Time: 23m 2.63s | pipeline 1328.75 move 37.65 prompt 37.55 te 35.38 decode 16.19 offload 12.64 | GPU 37726 MB 29% | RAM 58.53 GB 47%	Time: 40m 41.05s | pipeline 2391.21 move 32.94 prompt 32.89 te 30.65 decode 16.86 offload 14.00 | GPU 37726 MB 29% | RAM 58.53 GB 47%	Time: 22m 37.61s | pipeline 1298.06 move 44.98 prompt 44.07 te 41.80 decode 14.53 offload 12.12 | GPU 37724 MB 29% | RAM 58.57 GB 47%	Time: 39m 13.08s | pipeline 2315.96 move 22.51 prompt 22.46 te 20.28 decode 14.57 offload 12.04 | GPU 37942 MB 30% | RAM 58.6 GB 47%
Steps: 28 | Size: 512x512 | Seed: 2423417735 | CFG scale: 2 | Model: HiDream-I1-Dev | App: SD.Next | Version: 72eb013 | Operations: txt2img | Pipeline: HiDreamImagePipeline	Steps: 50 | Size: 1024x1024 | Seed: 2423417735 | CFG scale: 2 | Model: HiDream-I1-Dev | App: SD.Next | Version: 72eb013 | Operations: txt2img | Pipeline: HiDreamImagePipeline	Steps: 28 | Size: 512x512 | Seed: 2423417735 | CFG scale: 2 | Model: HiDream-I1-Full | App: SD.Next | Version: 72eb013 | Operations: txt2img | Pipeline: HiDreamImagePipeline	Steps: 50 | Size: 1024x1024 | Seed: 2423417735 | CFG scale: 2 | Model: HiDream-I1-Full | App: SD.Next | Version: 72eb013 | Operations: txt2img | Pipeline: HiDreamImagePipeline
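Because the four runs use different step counts, total times are hard to compare directly. A simple normalization is pipeline seconds divided by steps. The sketch below does that arithmetic with the figures from the table, assuming each timing cell belongs to the parameter cell in the same column. The helper `seconds_per_step` is hypothetical, and the pipeline figure may include work that does not scale with step count, so the result is a rough indicator only.

```python
RUNS = [
    # (model, steps, size, pipeline_seconds), paired by table column
    ("HiDream-I1-Dev", 28, "512x512", 1328.75),
    ("HiDream-I1-Dev", 50, "1024x1024", 2391.21),
    ("HiDream-I1-Full", 28, "512x512", 1298.06),
    ("HiDream-I1-Full", 50, "1024x1024", 2315.96),
]

def seconds_per_step(pipeline_seconds: float, steps: int) -> float:
    """Average pipeline time per sampling step."""
    return pipeline_seconds / steps

for model, steps, size, pipeline_seconds in RUNS:
    rate = seconds_per_step(pipeline_seconds, steps)
    print(f"{model:16} {size:9} {steps:2d} steps  {rate:6.2f} s/step")
```

Normalized this way, the four runs can be read side by side without the 28 versus 50 step difference dominating the comparison.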

