| Info |
|---|
| DRAFT |
Info
https://github.com/vladmandic/sdnext/wiki/HiDream
...
Prompt: A Nice woman is applying red nail polish on toenails at the windowsill of a New York high-rise building with a view of the night city.
| 4 | 8 | 16 | 28 | 32 | |||
|---|---|---|---|---|---|---|---|
| 0 | |||||||
| 1 | |||||||
| 2 | |||||||
| 3 | 4 |
Test 1 - Bookshop
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
...
| | 8 | 16 | 2428 | 32 |
|---|---|---|---|---|
| CFG0.5 | ||||
| CFG1 | ||||
| CFG1.5 | ||||
| CFG2 | ||||
| CFG2.5 |
Test 2 - Face and hand
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 32| Size: 1024x1024| Seed: 2423417735| Model: HiDream-I1-Dev| App: SD.Next| Version: 72eb013| Operations: txt2img| Pipeline: HiDreamImagePipeline
Execution: Time: 13m 29.71s | pipeline 763.85 move 31.70 prompt 31.66 offload 21.53 te 19.76 decode 14.14 | GPU 38086 MB 30% | RAM 58.64 GB 47%
| | 8 | 16 | 28 | 32 |
|---|---|---|---|---|
| 0.5 | ||||
| 1 | ||||
| 1.5 | ||||
| 2 | ||||
| 2.5 |
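
The Execution line for Test 2 can be turned into a per-step figure with a little arithmetic. The sketch below is only an illustration: `parse_execution` is a hypothetical helper, not part of SD.Next, and it assumes the line keeps the layout shown above (a `Time:` field, then a field of alternating stage names and seconds, separated by `|`).

```python
EXECUTION = (
    "Time: 13m 29.71s | pipeline 763.85 move 31.70 prompt 31.66 "
    "offload 21.53 te 19.76 decode 14.14 | GPU 38086 MB 30% | RAM 58.64 GB 47%"
)
STEPS = 32  # "Steps: 32" from the Parameters line


def parse_execution(line):
    """Split an Execution line into total wall time and per-stage seconds.

    Hypothetical helper: assumes the field layout of the line quoted above.
    """
    fields = [field.strip() for field in line.split("|")]
    # First field looks like "Time: 13m 29.71s"; minutes are treated as optional.
    total_s = 0.0
    for token in fields[0].removeprefix("Time:").split():
        if token.endswith("m"):
            total_s += 60 * float(token[:-1])
        elif token.endswith("s"):
            total_s += float(token[:-1])
    # Second field is a run of alternating "name value" pairs.
    tokens = fields[1].split()
    stages = {name: float(value) for name, value in zip(tokens[::2], tokens[1::2])}
    return total_s, stages


total_s, stages = parse_execution(EXECUTION)
per_step = stages["pipeline"] / STEPS

print(f"total wall time : {total_s:.2f} s")
print(f"pipeline / step : {per_step:.2f} s")
print(f"sum of stages   : {sum(stages.values()):.2f} s")
```

For this line the stage values add up to more than the total wall time, so the stages appear to overlap and are best read as individual timers rather than as shares of the total.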
Test 3 - Legs
| | 8 | 16 | 24 | 32 |
|---|---|---|---|---|
| CFG0.5 | ||||
| CFG1 | ||||
| CFG1.5 | ||||
| CFG2 | ||||
| CFG2.5 |
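
A grid like the one above can be enumerated programmatically before running it. This is a minimal sketch under stated assumptions: the columns are taken to be step counts and the rows CFG scales, the seed is the one reported in the Parameters lines elsewhere on this page (whether Test 3 used it is not stated), and `build_grid` and the dictionary keys are hypothetical names, not SD.Next settings.

```python
from itertools import product

STEPS = [8, 16, 24, 32]                  # column headers (assumed to be steps)
CFG_SCALES = [0.5, 1.0, 1.5, 2.0, 2.5]   # row labels (CFG0.5 ... CFG2.5)
SEED = 2423417735                        # assumption: same seed as other tests


def build_grid(steps, cfg_scales, seed):
    """Return one settings dict per grid cell, ordered row by row."""
    return [
        {"steps": s, "cfg_scale": c, "seed": seed}
        for c, s in product(cfg_scales, steps)
    ]


grid = build_grid(STEPS, CFG_SCALES, SEED)
print(f"{len(grid)} images to generate")
for cell in grid[:4]:  # first row of the grid
    print(cell)
```

Keeping the seed fixed across all cells means differences between images come only from the step count and CFG scale.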
System Info
app: sdnext.git updated: 2025-06-16 hash: 72eb0132 url: https://github.com/vladmandic/sdnext.git/tree/master
arch: x86_64 cpu: x86_64 system: Linux release: 6.11.0-28-generic python: 3.12.3
Torch 2.7.1+xpu
device: Intel(R) Arc(TM) Graphics (1) ipex:
ram: free:123.91 used:1.42 total:125.33
xformers: diffusers: 0.34.0.dev0 transformers: 4.52.4
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: Diffusers/HiDream-ai/HiDream-I1-Dev [5b3f48f0d6] refiner: none vae: none te: none unet: none
Tech details
Model: Diffusers/HiDream-ai/HiDream-I1-Dev Type: h1 Class: HiDreamImagePipeline Size: 0 bytes Modified: 2025-06-08 21:49:25

Module | Class | Device | DType | Params | Modules | Config |
|---|---|---|---|---|---|---|
vae | AutoencoderKL | cpu | torch.bfloat16 | 83819683 | 241 | FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 16, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.3611, 'shift_factor': 0.1159, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': False, 'use_post_quant_conv': False, 'mid_block_add_attention': True, '_class_name': 'AutoencoderKL', '_diffusers_version': '0.30.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--HiDream-ai--HiDream-I1-Dev/snapshots/0fad2ea0ccf9a80ddf019ea777eedb27c1ccb232/vae'}) |
text_encoder | CLIPTextModelWithProjection | xpu:0 | torch.bfloat16 | 123781632 | 153 | CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "quick_gelu", "hidden_size": 768, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 248, "model_type": "clip_text_model", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "projection_dim": 768, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "vocab_size": 49408 } |
text_encoder_2 | CLIPTextModelWithProjection | xpu:0 | torch.bfloat16 | 694840320 | 393 | CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "gelu", "hidden_size": 1280, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 5120, "layer_norm_eps": 1e-05, "max_position_embeddings": 218, "model_type": "clip_text_model", "num_attention_heads": 20, "num_hidden_layers": 32, "pad_token_id": 1, "projection_dim": 1280, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "vocab_size": 49408 } |
text_encoder_3 | T5EncoderModel | cpu | torch.bfloat16 | 4762310656 | 463 | T5Config { "architectures": [ "T5EncoderModel" ], "classifier_dropout": 0.0, "d_ff": 10240, "d_kv": 64, "d_model": 4096, "decoder_start_token_id": 0, "dense_act_fn": "gelu_new", "dropout_rate": 0.1, "eos_token_id": 1, "feed_forward_proj": "gated-gelu", "initializer_factor": 1.0, "is_encoder_decoder": true, "is_gated_act": true, "layer_norm_epsilon": 1e-06, "model_type": "t5", "num_decoder_layers": 24, "num_heads": 64, "num_layers": 24, "output_past": true, "pad_token_id": 0, "relative_attention_max_distance": 128, "relative_attention_num_buckets": 32, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "use_cache": true, "vocab_size": 32128 } |
text_encoder_4 | LlamaForCausalLM | cpu | torch.bfloat16 | 8030261248 | 423 | LlamaConfig { "architectures": [ "LlamaForCausalLM" ], "attention_bias": false, "attention_dropout": 0.0, "bos_token_id": 128000, "eos_token_id": [ 128001, 128008, 128009 ], "head_dim": 128, "hidden_act": "silu", "hidden_size": 4096, "initializer_range": 0.02, "intermediate_size": 14336, "max_position_embeddings": 131072, "mlp_bias": false, "model_type": "llama", "num_attention_heads": 32, "num_hidden_layers": 32, "num_key_value_heads": 8, "output_attentions": true, "output_hidden_states": true, "pretraining_tp": 1, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 8.0, "high_freq_factor": 4.0, "low_freq_factor": 1.0, "original_max_position_embeddings": 8192, "rope_type": "llama3" }, "rope_theta": 500000.0, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "use_cache": true, "vocab_size": 128256 } |
tokenizer | CLIPTokenizer | | | 0 | 0 | None |
tokenizer_2 | CLIPTokenizer | | | 0 | 0 | None |
tokenizer_3 | T5Tokenizer | | | 0 | 0 | None |
tokenizer_4 | PreTrainedTokenizerFast | | | 0 | 0 | None |
scheduler | FlowMatchLCMScheduler | | | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'shift': 6.0, 'use_dynamic_shifting': False, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'scale_factors': None, 'upscale_mode': 'bicubic', '_class_name': 'FlowMatchLCMScheduler', '_diffusers_version': '0.34.0.dev0'}) |
transformer | HiDreamImageTransformer2DModel | xpu:0 | torch.bfloat16 | 17105733184 | 2090 | FrozenDict({'patch_size': 2, 'in_channels': 16, 'out_channels': 16, 'num_layers': 16, 'num_single_layers': 32, 'attention_head_dim': 128, 'num_attention_heads': 20, 'caption_channels': [4096, 4096], 'text_emb_dim': 2048, 'num_routed_experts': 4, 'num_activated_experts': 2, 'axes_dims_rope': [64, 32, 32], 'max_resolution': [128, 128], 'llama_layers': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31], 'force_inference_output': False, '_use_default_values': ['force_inference_output'], '_class_name': 'HiDreamImageTransformer2DModel', '_diffusers_version': '0.32.1', '_name_or_path': 'HiDream-ai/HiDream-I1-Dev'}) |
_name_or_path | str | | | 0 | 0 | None |
_class_name | str | | | 0 | 0 | None |
_diffusers_version | str | | | 0 | 0 | None |
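
The Params column gives a rough idea of how much memory the weights alone need. The sketch below sums the parameter counts from the table and assumes 2 bytes per parameter, since every module is listed as `torch.bfloat16`. It is an estimate of weight storage only and ignores activations, buffers and framework overhead; `weights_gib` is a hypothetical helper name.

```python
# Parameter counts copied from the Params column of the table above.
PARAMS = {
    "vae": 83_819_683,
    "text_encoder": 123_781_632,
    "text_encoder_2": 694_840_320,
    "text_encoder_3": 4_762_310_656,
    "text_encoder_4": 8_030_261_248,
    "transformer": 17_105_733_184,
}
BYTES_PER_PARAM = 2  # torch.bfloat16 stores each value in 16 bits


def weights_gib(n_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate weight storage in GiB (weights only, no activations)."""
    return n_params * bytes_per_param / 2**30


total_params = sum(PARAMS.values())
for name, count in PARAMS.items():
    print(f"{name:15s} {count:>14,d} params {weights_gib(count):7.2f} GiB")
print(f"{'total':15s} {total_params:>14,d} params {weights_gib(total_params):7.2f} GiB")
```

The transformer and the Llama text encoder (`text_encoder_4`) account for most of the total, which is consistent with several modules being kept on `cpu` in the Device column.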
Speed of Dev compared with Full
Prompt: car
| HiDream-I1-Dev [5b3f48f0d6] FlowMatchLCMScheduler, 1024x1024 | | HiDream-I1-Full [fe6156b63d] UniPCMultistepScheduler, 1024x1024 | |
|---|---|---|---|
| Time: 23m 2.63s pipeline 1328.75 move 37.65 prompt 37.55 te 35.38 decode 16.19 offload 12.64 GPU 37726 MB 29% \| RAM 58.53 GB 47% | Time: 40m 41.05s pipeline 2391.21 move 32.94 prompt 32.89 te 30.65 decode 16.86 offload 14.00 GPU 37726 MB 29% \| RAM 58.53 GB 47% | Time: 22m 37.61s pipeline 1298.06 move 44.98 prompt 44.07 te 41.80 decode 14.53 offload 12.12 GPU 37724 MB 29% \| RAM 58.57 GB 47% | Time: 39m 13.08s |
| Steps: 28\| Size: 512x512\| Seed: 2423417735\| CFG scale: 2\| Model: HiDream-I1-Dev\| App: SD.Next\| Version: 72eb013\| Operations: txt2img\| Pipeline: HiDreamImagePipeline | Parameters: Steps: 50\| Size: 1024x1024\| Seed: 2423417735\| CFG scale: 2\| Model: HiDream-I1-Dev\| App: SD.Next\| Version: 72eb013\| Operations: txt2img\| Pipeline: HiDreamImagePipeline | Steps: 28\| Size: 512x512\| Seed: 2423417735\| CFG scale: 2\| Model: HiDream-I1-Full\| App: SD.Next\| Version: 72eb013\| Operations: txt2img\| Pipeline: HiDreamImagePipeline | Steps: 50\| Size: 1024x1024\| Seed: 2423417735\| CFG scale: 2\| |
...



