DRAFT

https://github.com/vladmandic/sdnext/wiki/HiDream

...

Prompt: A Nice woman is applying red nail polish on toenails at the windowsill of a New York high-rise building with a view of the night city.


[Image grid: rows 0, 1, 2, 3; five images per row. Column labels are run together in the source as "16182022" and "2448616828123214".]


Test 1 - Bookshop

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

...



[Image grid: rows CFG 0.5, CFG 1, CFG 1.5, CFG 2, CFG 2.5; four images per row. Column labels are run together in the source as "816242832".]


Test 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Parameters: Steps: 32| Size: 1024x1024| Seed: 2423417735| Model: HiDream-I1-Dev| App: SD.Next| Version: 72eb013| Operations: txt2img| Pipeline: HiDreamImagePipeline

Execution: Time: 13m 29.71s | pipeline 763.85 move 31.70 prompt 31.66 offload 21.53 te 19.76 decode 14.14 | GPU 38086 MB 30% | RAM 58.64 GB 47%
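
The Execution line above can be turned into numbers for comparison between runs. The following is a minimal sketch, assuming the field layout shown in that line (a `Time:` field, then name/seconds pairs, separated by `|`); the helper name `parse_execution` is hypothetical and is not part of SD.Next. It divides the reported `pipeline` time by the step count from the Parameters line to get a rough per-step figure.

```python
import re

# Execution line copied from the report; STEPS comes from the Parameters line.
EXECUTION = ("Time: 13m 29.71s | pipeline 763.85 move 31.70 prompt 31.66 "
             "offload 21.53 te 19.76 decode 14.14 | GPU 38086 MB 30% | RAM 58.64 GB 47%")
STEPS = 32


def parse_execution(line: str) -> tuple[float, dict[str, float]]:
    """Return (total_seconds, {stage: seconds}) from an Execution line.

    Hypothetical helper: it only assumes the layout visible in this report.
    """
    fields = [f.strip() for f in line.split("|")]
    m = re.fullmatch(r"Time:\s*(?:(\d+)m\s*)?([\d.]+)s", fields[0])
    if m is None:
        raise ValueError(f"unrecognised time field: {fields[0]!r}")
    total = int(m.group(1) or 0) * 60 + float(m.group(2))
    # The second field is a run of "name seconds" pairs.
    stages = {name: float(sec)
              for name, sec in re.findall(r"([a-z]+)\s+([\d.]+)", fields[1])}
    return total, stages


total, stages = parse_execution(EXECUTION)
per_step = stages["pipeline"] / STEPS
print(f"total {total:.2f}s, pipeline {stages['pipeline']:.2f}s, "
      f"{per_step:.2f}s per step")
```

For this run the sketch reports roughly 23.87 s of pipeline time per step. The stage figures add up to more than the total time, so they should not be read as disjoint slices of the run.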


[Image grid: rows CFG 0.5, 1, 1.5, 2, 2.5; four images per row. Column labels are run together in the source as "816", "24", "2832".]


Test 3 - Legs



[Image grid: columns 8, 16, 24, 32; rows CFG 0.5, CFG 1, CFG 1.5, CFG 2, CFG 2.5; four images per row.]


System Info

app: sdnext.git updated: 2025-06-16 hash: 72eb0132 url: https://github.com/vladmandic/sdnext.git/tree/master
arch: x86_64 cpu: x86_64 system: Linux release: 6.11.0-28-generic python: 3.12.3
Torch 2.7.1+xpu
device: Intel(R) Arc(TM) Graphics (1) ipex: 
ram: free:123.91 used:1.42 total:125.33
xformers: diffusers: 0.34.0.dev0 transformers: 4.52.4
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: Diffusers/HiDream-ai/HiDream-I1-Dev [5b3f48f0d6] refiner: none vae: none te: none unet: none



Tech details

Model: Diffusers/HiDream-ai/HiDream-I1-Dev
Type: h1
Class: HiDreamImagePipeline
Size: 0 bytes
Modified: 2025-06-08 21:49:25


Module | Class | Device | DType | Params | Modules | Config
vae | AutoencoderKL | cpu | torch.bfloat16 | 83819683 | 241 | FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 16, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.3611, 'shift_factor': 0.1159, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': False, 'use_post_quant_conv': False, 'mid_block_add_attention': True, '_class_name': 'AutoencoderKL', '_diffusers_version': '0.30.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--HiDream-ai--HiDream-I1-Dev/snapshots/0fad2ea0ccf9a80ddf019ea777eedb27c1ccb232/vae'})
text_encoder | CLIPTextModelWithProjection | xpu:0 | torch.bfloat16 | 123781632 | 153 | CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "quick_gelu", "hidden_size": 768, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 248, "model_type": "clip_text_model", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "projection_dim": 768, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "vocab_size": 49408 }
text_encoder_2 | CLIPTextModelWithProjection | xpu:0 | torch.bfloat16 | 694840320 | 393 | CLIPTextConfig { "architectures": [ "CLIPTextModelWithProjection" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "gelu", "hidden_size": 1280, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 5120, "layer_norm_eps": 1e-05, "max_position_embeddings": 218, "model_type": "clip_text_model", "num_attention_heads": 20, "num_hidden_layers": 32, "pad_token_id": 1, "projection_dim": 1280, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "vocab_size": 49408 }
text_encoder_3 | T5EncoderModel | cpu | torch.bfloat16 | 4762310656 | 463 | T5Config { "architectures": [ "T5EncoderModel" ], "classifier_dropout": 0.0, "d_ff": 10240, "d_kv": 64, "d_model": 4096, "decoder_start_token_id": 0, "dense_act_fn": "gelu_new", "dropout_rate": 0.1, "eos_token_id": 1, "feed_forward_proj": "gated-gelu", "initializer_factor": 1.0, "is_encoder_decoder": true, "is_gated_act": true, "layer_norm_epsilon": 1e-06, "model_type": "t5", "num_decoder_layers": 24, "num_heads": 64, "num_layers": 24, "output_past": true, "pad_token_id": 0, "relative_attention_max_distance": 128, "relative_attention_num_buckets": 32, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "use_cache": true, "vocab_size": 32128 }
text_encoder_4 | LlamaForCausalLM | cpu | torch.bfloat16 | 8030261248 | 423 | LlamaConfig { "architectures": [ "LlamaForCausalLM" ], "attention_bias": false, "attention_dropout": 0.0, "bos_token_id": 128000, "eos_token_id": [ 128001, 128008, 128009 ], "head_dim": 128, "hidden_act": "silu", "hidden_size": 4096, "initializer_range": 0.02, "intermediate_size": 14336, "max_position_embeddings": 131072, "mlp_bias": false, "model_type": "llama", "num_attention_heads": 32, "num_hidden_layers": 32, "num_key_value_heads": 8, "output_attentions": true, "output_hidden_states": true, "pretraining_tp": 1, "rms_norm_eps": 1e-05, "rope_scaling": { "factor": 8.0, "high_freq_factor": 4.0, "low_freq_factor": 1.0, "original_max_position_embeddings": 8192, "rope_type": "llama3" }, "rope_theta": 500000.0, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.52.4", "use_cache": true, "vocab_size": 128256 }
tokenizer | CLIPTokenizer |  |  | 0 | 0 | None
tokenizer_2 | CLIPTokenizer |  |  | 0 | 0 | None
tokenizer_3 | T5Tokenizer |  |  | 0 | 0 | None
tokenizer_4 | PreTrainedTokenizerFast |  |  | 0 | 0 | None
scheduler | FlowMatchLCMScheduler |  |  | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'shift': 6.0, 'use_dynamic_shifting': False, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'scale_factors': None, 'upscale_mode': 'bicubic', '_class_name': 'FlowMatchLCMScheduler', '_diffusers_version': '0.34.0.dev0'})
transformer | HiDreamImageTransformer2DModel | xpu:0 | torch.bfloat16 | 17105733184 | 2090 | FrozenDict({'patch_size': 2, 'in_channels': 16, 'out_channels': 16, 'num_layers': 16, 'num_single_layers': 32, 'attention_head_dim': 128, 'num_attention_heads': 20, 'caption_channels': [4096, 4096], 'text_emb_dim': 2048, 'num_routed_experts': 4, 'num_activated_experts': 2, 'axes_dims_rope': [64, 32, 32], 'max_resolution': [128, 128], 'llama_layers': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31], 'force_inference_output': False, '_use_default_values': ['force_inference_output'], '_class_name': 'HiDreamImageTransformer2DModel', '_diffusers_version': '0.32.1', '_name_or_path': 'HiDream-ai/HiDream-I1-Dev'})
_name_or_path | str |  |  | 0 | 0 | None
_class_name | str |  |  | 0 | 0 | None
_diffusers_version | str |  |  | 0 | 0 | None
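
The Params column can be converted into an approximate weight footprint, since every module is listed as torch.bfloat16 and a bfloat16 value occupies 2 bytes. The sketch below is only that arithmetic: the parameter counts are copied from the table, the helper name `weight_gib` is hypothetical, and the result covers weights only (no activations, caches, or framework overhead).

```python
# Parameter counts copied from the Params column of the module table.
PARAMS = {
    "vae": 83_819_683,
    "text_encoder": 123_781_632,
    "text_encoder_2": 694_840_320,
    "text_encoder_3": 4_762_310_656,
    "text_encoder_4": 8_030_261_248,
    "transformer": 17_105_733_184,
}
BYTES_PER_PARAM = 2  # torch.bfloat16 stores each value in 16 bits


def weight_gib(n_params: int) -> float:
    """Weights-only size in GiB for n_params bfloat16 values."""
    return n_params * BYTES_PER_PARAM / 2**30


for name, n in PARAMS.items():
    print(f"{name:15s} {n:>14,d} params  {weight_gib(n):6.2f} GiB")

total_params = sum(PARAMS.values())
print(f"{'total':15s} {total_params:>14,d} params  {weight_gib(total_params):6.2f} GiB")
```

By this estimate the transformer alone accounts for a little under 32 GiB of weights, and all listed modules together for a little over 57 GiB.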


Speed of Dev compared with Full

Prompt: car

HiDream-I1-Dev [5b3f48f0d6] FlowMatchLCMScheduler, 1024x1024
HiDream-I1-Full [fe6156b63d] UniPCMultistepScheduler, 1024x1024

[Four result images]

Execution: Time: 23m 2.63s | pipeline 1328.75 move 37.65 prompt 37.55 te 35.38 decode 16.19 offload 12.64 | GPU 37726 MB 29% | RAM 58.53 GB 47%

Execution: Time: 40m 41.05s | pipeline 2391.21 move 32.94 prompt 32.89 te 30.65 decode 16.86 offload 14.00 | GPU 37726 MB 29% | RAM 58.53 GB 47%

Execution: Time: 22m 37.61s | pipeline 1298.06 move 44.98 prompt 44.07 te 41.80 decode 14.53 offload 12.12 | GPU 37724 MB 29% | RAM 58.57 GB 47%

Execution: Time: 39m 13.08s | pipeline 2315.96 move 22.51 prompt 22.46 te 20.28 decode 14.57 offload 12.04 | GPU 37942 MB 30% | RAM 58.6 GB 47%

Parameters: Steps: 28| Size: 512x512| Seed: 2423417735| CFG scale: 2| Model: HiDream-I1-Dev| App: SD.Next| Version: 72eb013| Operations: txt2img| Pipeline: HiDreamImagePipeline

Parameters: Steps: 50| Size: 1024x1024| Seed: 2423417735| CFG scale: 2| Model: HiDream-I1-Dev| App: SD.Next| Version: 72eb013| Operations: txt2img| Pipeline: HiDreamImagePipeline

Parameters: Steps: 28| Size: 512x512| Seed: 2423417735| CFG scale: 2| Model: HiDream-I1-Full| App: SD.Next| Version: 72eb013| Operations: txt2img| Pipeline: HiDreamImagePipeline

Parameters: Steps: 50| Size: 1024x1024| Seed: 2423417735| CFG scale: 2| Model: HiDream-I1-Full| App: SD.Next| Version: 72eb013| Operations: txt2img| Pipeline: HiDreamImagePipeline
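
To compare Dev and Full on equal footing, the pipeline time can be normalised by step count. The sketch below does this under one explicit assumption that the page does not confirm: the four timing entries are taken to correspond to the four parameter blocks in listing order (Dev 28 steps, Dev 50 steps, Full 28 steps, Full 50 steps). If the pairing is different, the figures change accordingly. The helper name `seconds_per_step` is hypothetical.

```python
# (model, steps, pipeline_seconds).
# ASSUMPTION: timings are paired with parameter blocks by listing order.
RUNS = [
    ("HiDream-I1-Dev", 28, 1328.75),
    ("HiDream-I1-Dev", 50, 2391.21),
    ("HiDream-I1-Full", 28, 1298.06),
    ("HiDream-I1-Full", 50, 2315.96),
]


def seconds_per_step(pipeline_seconds: float, steps: int) -> float:
    """Average pipeline time spent on one sampling step."""
    return pipeline_seconds / steps


for model, steps, pipeline_seconds in RUNS:
    rate = seconds_per_step(pipeline_seconds, steps)
    print(f"{model:16s} {steps:3d} steps  {rate:6.2f} s/step")
```

Under that pairing all four runs land between about 46 and 48 seconds per step, which would suggest the two models cost roughly the same per step on this machine.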