...

1024px

CFG 3.5, Steps 50 | Seed: 1620085323 | Seed: 1931701040 | Seed: 4075624134 | Seed: 2736029172

bookshop girl

Image Modified

Image Modified

Image Modified

Image Modified

hand and face

Image Modified

Image Modified

Image Modified

Image Modified

legs and shoes

Image Modified

Image Modified

Image Modified

Image Modified

2048px

CFG 3.5, Steps 50 | Seed: 1620085323 | Seed: 1931701040 | Seed: 4075624134 | Seed: 2736029172

bookshop girl


Image Modified

Image Modified

Image Modified

Image Modified

hand and face

Image Modified

Image Modified

Image Modified

Image Added

legs and shoes

Image Added

Image Added

Image Added

Image Added


Test 1 - Bookshop

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

...

CFG1

...

CFG2

...

CFG3

...

CFG4

...

CFG5

...

CFG6

...

CFG8

Test 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

...

CFG1

...

CFG2

...

CFG3

...

CFG3.5

...

CFG4

...

CFG5

...

CFG8

Test 3 - Legs

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

...

CFG1

...

CFG2

...

CFG3

...

CFG3.5

...

CFG4

...

CFG5

...

CFG8
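The CFG values swept in Tests 1 to 3 control how strongly the prompt-conditioned prediction is pushed away from the unconditional one. The sketch below shows plain classifier-free guidance only. It is not the adaptive projected variant used by the guider in this pipeline, and the function name and toy vectors are made up for illustration.

```python
def cfg_combine(cond: list[float], uncond: list[float], scale: float) -> list[float]:
    """Plain classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# Toy predictions (hypothetical values, not model output)
cond = [1.0, 2.0, 3.0]
uncond = [0.5, 2.0, 4.0]

for scale in (1.0, 3.5, 8.0):
    # At scale 1.0 the result equals the conditional prediction;
    # larger scales exaggerate the difference between cond and uncond.
    print(scale, cfg_combine(cond, uncond, scale))
```

In this formulation CFG 1 leaves the conditional prediction unchanged, which is one way to read the low end of the sweep.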

Test 4 - Other model Covers

1024px

...

Parameters: Steps: 32| Size: 2048x2048| Seed: 1931701040| CFG scale: 1.5| App: SD.Next| Version: 1aee3cc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Time: 54m 24.69s | total 3325.52 pipeline 3264.59 vae 17.23 offload 15.34 onload 14.21 te 8.61 callback 5.39 | GPU 52616 MB 41% | RAM 98.26 GB 78%
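The timing line can be turned into a per-step figure. A minimal sketch, assuming the middle segment of the line always has the form "name value name value ..."; the helper name parse_timing is hypothetical, and the step count is taken from the Parameters line above.

```python
import re

def parse_timing(line: str) -> dict[str, float]:
    """Extract the 'name value' pairs from the second '|'-separated segment."""
    segment = line.split("|")[1]
    pairs = re.findall(r"([a-z]+)\s+([0-9.]+)", segment)
    return {name: float(value) for name, value in pairs}

line = ("Time: 54m 24.69s | total 3325.52 pipeline 3264.59 vae 17.23 offload 15.34 "
        "onload 14.21 te 8.61 callback 5.39 | GPU 52616 MB 41% | RAM 98.26 GB 78%")

timing = parse_timing(line)
steps = 32  # from "Steps: 32" in the Parameters line
seconds_per_step = timing["pipeline"] / steps
print(f"{seconds_per_step:.1f} s per step")
```

For this run the pipeline time works out to roughly 102 seconds per step at 2048x2048.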



4 | 8 | 16 | 32 | 64

CFG1

CFG2

CFG3

CFG4

CFG5

CFG6

CFG8

Image Added

Image Added

Image Added

Image Added

Image Added


Test 2 - Face and hand

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.



8 | 16 | 20 | 32

CFG3

Image Added

Image Added

Image Added

Image Added

Test 3 - Legs

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.



8 | 16 | 20 | 32

CFG3

Image Added

Image Added

Image Added

Image Added

Test 4 - Other model Covers

512px

Image Added

1024px

Image Added | Image Added | Image Added | Image Added | Image Added | Image Added

2048px

Image Added | Image Added | Image Added | Image Added | Image Added | Image Added | Image Added | Image Added | Image Added | Image Added
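The three resolutions differ a lot in how much work the transformer has to do. The VAE config listed under Model info reports spatial_compression_ratio 32 and latent_channels 64, and the transformer config reports patch_size [1, 1]. A rough sketch of the resulting latent sizes, assuming one token per latent pixel; latent_shape is a hypothetical helper.

```python
def latent_shape(width: int, height: int, compression: int = 32, channels: int = 64):
    """Return (channels, latent_height, latent_width) for a given image size."""
    if width % compression or height % compression:
        raise ValueError("image size must be a multiple of the compression ratio")
    return channels, height // compression, width // compression

for size in (512, 1024, 2048):
    c, h, w = latent_shape(size, size)
    # Assumption: with patch_size [1, 1], each latent pixel becomes one image token
    print(f"{size}px -> latent {c}x{h}x{w}, {h * w} image tokens")
```

Under these assumptions the token count grows 4x with each doubling of resolution, from 256 at 512px to 4096 at 2048px.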


Distilled vs non-Distilled

non-Distilled | Distilled

Image Added

Prompt: score_9, style_cluster_1679, professional photo, fashion portrait of a beautiful attractive gorgeous pretty sexy woman in a champagne thin silk V-Neck sleeveless backless dress, Instagram influencer photo-model face, intricate details of skin, natural breasts, luxury vibes, dramatic light, photorealism, advertising poster, blurry dim neon light "17" is on the background top left corner

Negative: score_1, score_2, score_3, score_4 anime, ugly, bad, wrong, weird, low quality, noisy, grainy, blurry, distorted, deformed, mutated, mutilated, plastic, smooth, text, signature, username, watermark

Parameters: Steps: 50| Size: 2048x2048| Seed: 426578498| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers
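The Parameters lines on this page share one "Key: value| Key: value" layout, so they are easy to compare programmatically. A minimal sketch, assuming fields are separated by "|" and values never contain that character; parse_parameters is a hypothetical helper, not part of SD.Next.

```python
def parse_parameters(line: str) -> dict[str, str]:
    """Split a 'Parameters:' line into a key/value dictionary."""
    body = line.removeprefix("Parameters:").strip()
    result = {}
    for field in body.split("|"):
        # Split on the first colon only, so values may contain colons
        key, _, value = field.partition(":")
        result[key.strip()] = value.strip()
    return result

line = ("Parameters: Steps: 50| Size: 2048x2048| Seed: 426578498| CFG scale: 3.5| "
        "App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| "
        "Operations: txt2img| Model: HunyuanImage-2.1-Diffusers")

params = parse_parameters(line)
width, height = (int(v) for v in params["Size"].split("x"))
print(params["Steps"], params["CFG scale"], width, height)
```

Parsing both lines of a pair this way makes it simple to confirm that only the step count, CFG values, and model differ between the non-distilled and distilled runs.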

Image Added

Prompt: score_9, style_cluster_1679, professional photo, fashion portrait of a beautiful attractive gorgeous pretty sexy woman in a champagne thin silk V-Neck sleeveless backless dress, Instagram influencer photo-model face, intricate details of skin, natural breasts, luxury vibes, dramatic light, photorealism, advertising poster, blurry dim neon light "17" is on the background top left corner

Negative: score_1, score_2, score_3, score_4 anime, ugly, bad, wrong, weird, low quality, noisy, grainy, blurry, distorted, deformed, mutated, mutilated, plastic, smooth, text, signature, username, watermark

Parameters: Steps: 8| Size: 2048x2048| Seed: 426578498| CFG scale: 3.25| CFG true: 1.4| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

Image Added

Prompt: Photorealistic, ultra-detailed marble reflections, 85mm lens bokeh, soft directional overhead light; empty after-hours office corridor with marble floors and frosted-glass walls; a Korean woman leaning against the wall, one leg bent so her bare foot presses on the cool stone; her high heels lie beside her, coat lifted to reveal her ankle and arch; she holds a stack of documents in one hand, the other teasingly lifting her coat hem; clandestine, charged tension in a corporate space; detailed toes fingers and face details

Negative: watermark, cartoon, extra limbs

Parameters: Steps: 50| Size: 2048x2048| Seed: 1866289189| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added

Prompt: Photorealistic, ultra-detailed marble reflections, 85mm lens bokeh, soft directional overhead light; empty after-hours office corridor with marble floors and frosted-glass walls; a Korean woman leaning against the wall, one leg bent so her bare foot presses on the cool stone; her high heels lie beside her, coat lifted to reveal her ankle and arch; she holds a stack of documents in one hand, the other teasingly lifting her coat hem; clandestine, charged tension in a corporate space; detailed toes fingers and face details

Negative: watermark, cartoon, extra limbs

Parameters: Steps: 16| Size: 2048x2048| Seed: 1866289189| CFG scale: 3.25| CFG true: 1.4| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

Image Added

Prompt: Sunlight streams through the arched roof, casting dramatic beams and long shadows across a vintage train station platform lined with ornate iron columns. A stationary dark-colored passenger train occupies the right track; its windows reflect light subtly. The left side features an empty tunnel entrance framed by stone walls. Figures: Two individuals stand near each other on the lower part of the frame. one standing upright holding something red (possibly luggage), facing away from the viewer towards distant tracks or signage marked "17B" in blue text above them under bright sunlight; another figure sits slightly bent forward next to some equipment or bags close behind this person's legs, head bowed as if resting hands together between knees while gazing downward at ground level where shadow meets sunlit area creating high contrast effect due to strong backlighting conditions. Style: Photorealistic illustration mimicking photographic depth but enhanced for artistic drama. Color Palette: Dominated by deep browns, blacks, grays contrasting sharply with golden-yellow warm tones illuminating upper portions especially around central beam of direct sunshine piercing darkness below. Lighting: Strong directional backlit rays create intense highlights on metallic surfaces like railings and pillars' tops while leaving bases cloaked entirely within shade enhancing three-dimensionality via stark luminosity gradients. Texture: Rough stonework visible along tunnels and building facades juxtaposed against smooth polished metalwork defining structural elements such as support posts which bear intricate decorative carvings atop their capitals adding historical architectural charm reminiscent classic European railway stations. 
Medium: Digital art rendered closely resembling hyper-detailed photography employing selective focus techniques similar portrait photography emphasizing foreground subjects amidst vast background expanse filled soft diffused glow emanating filtered daylight passing overhead arches forming dreamlike ethereal ambiance suggestive quiet contemplation solitude amid bustling transit hub momentarily paused time travel experience before journey commences mood evokes nostalgia tranquility anticipation transition life phases represented physical stillness motion implied awaiting departure arrival potential connections fleeting moments captured suspended narrative tension balance serenity melancholy wonder Stylistic keywords: Hyperrealism, chiaroscuro lighting, atmospheric perspective, textured architecture, cinematic composition, emotional storytelling, interplay of natural vs. artificial structures, timeless setting

Parameters: Steps: 50| Size: 2048x2048| Seed: 38| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added

Prompt: Sunlight streams through the arched roof, casting dramatic beams and long shadows across a vintage train station platform lined with ornate iron columns. A stationary dark-colored passenger train occupies the right track; its windows reflect light subtly. The left side features an empty tunnel entrance framed by stone walls. Figures: Two individuals stand near each other on the lower part of the frame. one standing upright holding something red (possibly luggage), facing away from the viewer towards distant tracks or signage marked "17B" in blue text above them under bright sunlight; another figure sits slightly bent forward next to some equipment or bags close behind this person's legs, head bowed as if resting hands together between knees while gazing downward at ground level where shadow meets sunlit area creating high contrast effect due to strong backlighting conditions. Style: Photorealistic illustration mimicking photographic depth but enhanced for artistic drama. Color Palette: Dominated by deep browns, blacks, grays contrasting sharply with golden-yellow warm tones illuminating upper portions especially around central beam of direct sunshine piercing darkness below. Lighting: Strong directional backlit rays create intense highlights on metallic surfaces like railings and pillars' tops while leaving bases cloaked entirely within shade enhancing three-dimensionality via stark luminosity gradients. Texture: Rough stonework visible along tunnels and building facades juxtaposed against smooth polished metalwork defining structural elements such as support posts which bear intricate decorative carvings atop their capitals adding historical architectural charm reminiscent classic European railway stations. 
Medium: Digital art rendered closely resembling hyper-detailed photography employing selective focus techniques similar portrait photography emphasizing foreground subjects amidst vast background expanse filled soft diffused glow emanating filtered daylight passing overhead arches forming dreamlike ethereal ambiance suggestive quiet contemplation solitude amid bustling transit hub momentarily paused time travel experience before journey commences mood evokes nostalgia tranquility anticipation transition life phases represented physical stillness motion implied awaiting departure arrival potential connections fleeting moments captured suspended narrative tension balance serenity melancholy wonder Stylistic keywords: Hyperrealism, chiaroscuro lighting, atmospheric perspective, textured architecture, cinematic composition, emotional storytelling, interplay of natural vs. artificial structures, timeless setting

Parameters: Steps: 16| Size: 2048x2048| Seed: 38| CFG scale: 3.25| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

 Image Added

Prompt: Sunlight streams through the arched roof, casting dramatic beams and long shadows across a vintage train station platform lined with ornate iron columns. A stationary dark-colored passenger train occupies the right track; its windows reflect light subtly. The left side features an empty tunnel entrance framed by stone walls. Figures: Two individuals stand near each other on the lower part of the frame. one standing upright holding something red (possibly luggage), facing away from the viewer towards distant tracks or signage marked "17B" in blue text above them under bright sunlight; another figure sits slightly bent forward next to some equipment or bags close behind this person's legs, head bowed as if resting hands together between knees while gazing downward at ground level where shadow meets sunlit area creating high contrast effect due to strong backlighting conditions. Style: Photorealistic illustration mimicking photographic depth but enhanced for artistic drama. Color Palette: Dominated by deep browns, blacks, grays contrasting sharply with golden-yellow warm tones illuminating upper portions especially around central beam of direct sunshine piercing darkness below. Lighting: Strong directional backlit rays create intense highlights on metallic surfaces like railings and pillars' tops while leaving bases cloaked entirely within shade enhancing three-dimensionality via stark luminosity gradients. Texture: Rough stonework visible along tunnels and building facades juxtaposed against smooth polished metalwork defining structural elements such as support posts which bear intricate decorative carvings atop their capitals adding historical architectural charm reminiscent classic European railway stations. 
Medium: Digital art rendered closely resembling hyper-detailed photography employing selective focus techniques similar portrait photography emphasizing foreground subjects amidst vast background expanse filled soft diffused glow emanating filtered daylight passing overhead arches forming dreamlike ethereal ambiance suggestive quiet contemplation solitude amid bustling transit hub momentarily paused time travel experience before journey commences mood evokes nostalgia tranquility anticipation transition life phases represented physical stillness motion implied awaiting departure arrival potential connections fleeting moments captured suspended narrative tension balance serenity melancholy wonder Stylistic keywords: Hyperrealism, chiaroscuro lighting, atmospheric perspective, textured architecture, cinematic composition, emotional storytelling, interplay of natural vs. artificial structures, timeless setting

Parameters: Steps: 50| Size: 2048x2048| Seed: 2025| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added

Prompt: Sunlight streams through the arched roof, casting dramatic beams and long shadows across a vintage train station platform lined with ornate iron columns. A stationary dark-colored passenger train occupies the right track; its windows reflect light subtly. The left side features an empty tunnel entrance framed by stone walls. Figures: Two individuals stand near each other on the lower part of the frame. one standing upright holding something red (possibly luggage), facing away from the viewer towards distant tracks or signage marked "17B" in blue text above them under bright sunlight; another figure sits slightly bent forward next to some equipment or bags close behind this person's legs, head bowed as if resting hands together between knees while gazing downward at ground level where shadow meets sunlit area creating high contrast effect due to strong backlighting conditions. Style: Photorealistic illustration mimicking photographic depth but enhanced for artistic drama. Color Palette: Dominated by deep browns, blacks, grays contrasting sharply with golden-yellow warm tones illuminating upper portions especially around central beam of direct sunshine piercing darkness below. Lighting: Strong directional backlit rays create intense highlights on metallic surfaces like railings and pillars' tops while leaving bases cloaked entirely within shade enhancing three-dimensionality via stark luminosity gradients. Texture: Rough stonework visible along tunnels and building facades juxtaposed against smooth polished metalwork defining structural elements such as support posts which bear intricate decorative carvings atop their capitals adding historical architectural charm reminiscent classic European railway stations. 
Medium: Digital art rendered closely resembling hyper-detailed photography employing selective focus techniques similar portrait photography emphasizing foreground subjects amidst vast background expanse filled soft diffused glow emanating filtered daylight passing overhead arches forming dreamlike ethereal ambiance suggestive quiet contemplation solitude amid bustling transit hub momentarily paused time travel experience before journey commences mood evokes nostalgia tranquility anticipation transition life phases represented physical stillness motion implied awaiting departure arrival potential connections fleeting moments captured suspended narrative tension balance serenity melancholy wonder Stylistic keywords: Hyperrealism, chiaroscuro lighting, atmospheric perspective, textured architecture, cinematic composition, emotional storytelling, interplay of natural vs. artificial structures, timeless setting

Parameters: Steps: 16| Size: 2048x2048| Seed: 2025| CFG scale: 3.25| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

Image Added

Prompt: A surreal, artistic portrait of a serene young woman with closed eyes, surrounded by an explosion of colorful powder and paint splashes. The composition has a dreamlike, ethereal atmosphere with vivid bursts of neon blue, green, orange, pink, and yellow around her face and hair. Her expression is peaceful, as if meditating or lost in thought. The background is dark, almost black, which makes the vibrant colors stand out dramatically. Above her head, the word “HUNYUAN” appears in modern, minimalist typography. Cinematic lighting, ultra-detailed, high resolution, digital art style.

Parameters: Steps: 50| Size: 2048x2048| Seed: 42| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added

Prompt: A surreal, artistic portrait of a serene young woman with closed eyes, surrounded by an explosion of colorful powder and paint splashes. The composition has a dreamlike, ethereal atmosphere with vivid bursts of neon blue, green, orange, pink, and yellow around her face and hair. Her expression is peaceful, as if meditating or lost in thought. The background is dark, almost black, which makes the vibrant colors stand out dramatically. Above her head, the word “HUNYUAN” appears in modern, minimalist typography. Cinematic lighting, ultra-detailed, high resolution, digital art style.

Parameters: Steps: 8| Size: 2048x2048| Seed: 42| CFG scale: 3.25| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

Image Added


Prompt: nsfw scene from sin city, photorealistic wet sexy young woman holding umbrella and walking in the city square, long hair, transparent wet shirt, mini skirt, buildings with neon signs, during night, visible raindrops falling, reflections on the floor, perfect face, extreme detailed, dslr, leika

Parameters: Steps: 50| Size: 2048x2048| Seed: 42| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added


Prompt: nsfw scene from sin city, photorealistic wet sexy young woman holding umbrella and walking in the city square, long hair, transparent wet shirt, mini skirt, buildings with neon signs, during night, visible raindrops falling, reflections on the floor, perfect face, extreme detailed, dslr, leika

Parameters: Steps: 8| Size: 2048x2048| Seed: 42| CFG scale: 3.25| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

Image Added

Prompt: A close-up scene showing the text "Hunyuan Image 2.1" meticulously formed by a womans cherry lipstick on the outer surface of a rainy windowpane. The letters are also framed using individual raindrops that gather and slide along the glass, forming clear, sharp characters with slight reflections. The woman’s subtle touch a corner of the window glass with one hand, her face is partially blurred at the edges, visible only faintly through the wet glass - her features soft and indistinct, eyes downcast, hair wet, damp and clinging to her shoulders. Behind the window, a dense urban skyline stretches into the distance, featuring towering skyscrapers with reflective surfaces catching the dim twilight glow; fog curls around some buildings, adding depth and atmospheric haze. In the upper-left corner of the composition, written in loose, uneven cursive handwriting in faded blue ink, the number 17B appears subtly beneath the edge of the frame. Rain streaks cascade diagonally across the glass, enhancing texture and motion, creating a sense of quiet intensity and isolation within the city.

Parameters: Steps: 50| Size: 2048x2048| Seed: 1873624607| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added

Prompt: A close-up scene showing the text "Hunyuan Image 2.1" meticulously formed by a womans cherry lipstick on the outer surface of a rainy windowpane. The letters are also framed using individual raindrops that gather and slide along the glass, forming clear, sharp characters with slight reflections. The woman’s subtle touch a corner of the window glass with one hand, her face is partially blurred at the edges, visible only faintly through the wet glass - her features soft and indistinct, eyes downcast, hair wet, damp and clinging to her shoulders. Behind the window, a dense urban skyline stretches into the distance, featuring towering skyscrapers with reflective surfaces catching the dim twilight glow; fog curls around some buildings, adding depth and atmospheric haze. In the upper-left corner of the composition, written in loose, uneven cursive handwriting in faded blue ink, the number 17B appears subtly beneath the edge of the frame. Rain streaks cascade diagonally across the glass, enhancing texture and motion, creating a sense of quiet intensity and isolation within the city.

Parameters: Steps: 8| Size: 2048x2048| Seed: 1873624607| CFG scale: 3.25| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers

Image Added


Prompt: Female Age 40, Model sits perched at the edge of an antique bench, one foot on the floor while the other leg rests along the bench, loose silk slip dress draping naturally, light casting romantic shadows.. Face Shape: soft oval. Skin Tone: pale with slight rosiness. Eyes: intense green. Hair: curly brunette. Body Type: curvy yet athletic.

Parameters: Steps: 50| Size: 2048x2048| Seed: 42| CFG scale: 3.5| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Diffusers

Image Added


Prompt: Female Age 40, Model sits perched at the edge of an antique bench, one foot on the floor while the other leg rests along the bench, loose silk slip dress draping naturally, light casting romantic shadows.. Face Shape: soft oval. Skin Tone: pale with slight rosiness. Eyes: intense green. Hair: curly brunette. Body Type: curvy yet athletic.

Parameters: Steps: 8| Size: 2048x2048| Seed: 42| CFG scale: 3.25| App: SD.Next| Version: ded5afc| Pipeline: HunyuanImagePipeline| Operations: txt2img| Model: HunyuanImage-2.1-Distilled-Diffusers


System info


Sat Oct 25 12:53:29 2025
app: sdnext.git updated: 2025-10-24 hash: 88ac83839 url: https://github.com/liutyi/sdnext.git/tree/pytorch
arch: x86_64 cpu: x86_64 system: Linux release: 6.14.0-33-generic
python: 3.12.3 python: 3.12.3 Torch: 2.9.0+xpu
device: Intel(R) Arc(TM) Graphics (1) ipex: 
ram: free:119.7 used:5.63 total:125.33
xformers: diffusers: 0.36.0.dev0 transformers: 4.57.1
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: Diffusers/hunyuanvideo-community/HunyuanImage-2.1-Diffusers [7e7b7a177d] refiner: none vae: none te: none unet: none
Backend: ipex Pipeline: native Memory optimization: none Cross-attention: Scaled-Dot-Product

...

  "huggingface_token": "hf_..FraU",
  "diffusers_version": "7536f647e4144c7acaf9e140893ff7edb85bf9a3",
  "sd_model_checkpoint": "hunyuanvideo-community/HunyuanImage-2.1-Diffusers",
  "sd_checkpoint_hash": null,
  "diffusers_to_gpu": true,
  "device_map": "gpu",
  "model_wan_stage": "combined",
  "diffusers_offload_mode": "none",
  "ui_request_timeout": 300000,
  "show_progress_type": "Simple"


Model info

Diffusers/hunyuanvideo-community/HunyuanImage-2.1-Diffusers [7e7b7a177d]

Module | Class | Device | Dtype | Quant | Params | Modules | Config
vae | AutoencoderKLHunyuanImage | cpu | torch.bfloat16 | None | 405575491 | 255

FrozenDict({'in_channels': 3, 'out_channels': 3, 'latent_channels': 64, 'block_out_channels': [128, 256, 512, 512, 1024, 1024], 'layers_per_block': 2, 'spatial_compression_ratio': 32, 'sample_size': 384, 'scaling_factor': 0.75289, 'downsample_match_channel': True, 'upsample_match_channel': True, '_class_name': 'AutoencoderKLHunyuanImage', '_diffusers_version': '0.36.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--hunyuanvideo-community--HunyuanImage-2.1-Diffusers/snapshots/7e7b7a177de58591aeaffca0929f4765003d7ced/vae'})

text_encoder | Qwen2_5_VLForConditionalGeneration | xpu:0 | torch.bfloat16 | None | 8292166656 | 763

Qwen2_5_VLConfig { "architectures": [ "Qwen2_5_VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 128000, "max_window_layers": 28, "model_type": "qwen2_5_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "text_config": { "_name_or_path": "hunyuanvideo-community/HunyuanImage-2.1-Diffusers", "architectures": [ "Qwen2_5_VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "layer_types": [ "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention" ], "max_position_embeddings": 128000, "max_window_layers": 28, "model_type": "qwen2_5_vl_text", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": null, "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, 
"vocab_size": 152064 }, "tie_word_embeddings": false, "transformers_version": "4.57.1", "use_cache": true, "use_sliding_window": false, "vision_config": { "depth": 32, "dtype": "bfloat16", "fullatt_block_indexes": [ 7, 15, 23, 31 ], "hidden_act": "silu", "hidden_size": 1280, "in_channels": 3, "in_chans": 3, "initializer_range": 0.02, "intermediate_size": 3420, "model_type": "qwen2_5_vl", "num_heads": 16, "out_hidden_size": 3584, "patch_size": 14, "spatial_merge_size": 2, "spatial_patch_size": 14, "temporal_patch_size": 2, "tokens_per_second": 2, "window_size": 112 }, "vision_token_id": 151654, "vocab_size": 152064 }

tokenizer | Qwen2Tokenizer | None | None | None | 0 | 0

None

text_encoder_2 | T5EncoderModel | xpu:0 | torch.bfloat16 | None | 219314944 | 235

T5Config { "architectures": [ "T5EncoderModel" ], "classifier_dropout": 0.0, "d_ff": 3584, "d_kv": 64, "d_model": 1472, "decoder_start_token_id": 0, "dense_act_fn": "gelu_new", "dropout_rate": 0.1, "dtype": "bfloat16", "eos_token_id": 1, "feed_forward_proj": "gated-gelu", "gradient_checkpointing": false, "initializer_factor": 1.0, "is_encoder_decoder": false, "is_gated_act": true, "layer_norm_epsilon": 1e-06, "model_type": "t5", "num_decoder_layers": 4, "num_heads": 6, "num_layers": 12, "pad_token_id": 0, "relative_attention_max_distance": 128, "relative_attention_num_buckets": 32, "tie_word_embeddings": false, "tokenizer_class": "ByT5Tokenizer", "transformers_version": "4.57.1", "use_cache": false, "vocab_size": 1510 }

tokenizer_2 | ByT5Tokenizer | None | None | None | 0 | 0

None

transformer | HunyuanImageTransformer2DModel | xpu:0 | torch.bfloat16 | None | 17425795520 | 1397

FrozenDict({'in_channels': 64, 'out_channels': 64, 'num_attention_heads': 28, 'attention_head_dim': 128, 'num_layers': 20, 'num_single_layers': 40, 'num_refiner_layers': 2, 'mlp_ratio': 4.0, 'patch_size': [1, 1], 'qk_norm': 'rms_norm', 'guidance_embeds': False, 'text_embed_dim': 3584, 'text_embed_2_dim': 1472, 'rope_theta': 256.0, 'rope_axes_dim': [64, 64], 'use_meanflow': False, '_use_default_values': ['use_meanflow'], '_class_name': 'HunyuanImageTransformer2DModel', '_diffusers_version': '0.36.0.dev0', '_name_or_path': 'hunyuanvideo-community/HunyuanImage-2.1-Diffusers'})

scheduler | FlowMatchEulerDiscreteScheduler | None | None | None | 0 | 0

FrozenDict({'num_train_timesteps': 1000, 'shift': 5.0, 'use_dynamic_shifting': False, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'stochastic_sampling': False, '_class_name': 'FlowMatchEulerDiscreteScheduler', '_diffusers_version': '0.36.0.dev0'})

guider | AdaptiveProjectedMixGuidance | None | None | None | 0 | 0

FrozenDict({'guidance_scale': 3.5, 'guidance_rescale': 0.0, 'adaptive_projected_guidance_scale': 10.0, 'adaptive_projected_guidance_momentum': -0.5, 'adaptive_projected_guidance_rescale': 10.0, 'eta': 0.0, 'use_original_formulation': False, 'start': 0.0, 'stop': 1.0, 'adaptive_projected_guidance_start_step': 5, 'enabled': True, '_class_name': 'AdaptiveProjectedMixGuidance', '_diffusers_version': '0.36.0.dev0'})

ocr_guider | AdaptiveProjectedMixGuidance | None | None | None | 0 | 0

FrozenDict({'guidance_scale': 3.5, 'guidance_rescale': 0.0, 'adaptive_projected_guidance_scale': 10.0, 'adaptive_projected_guidance_momentum': -0.5, 'adaptive_projected_guidance_rescale': 10.0, 'eta': 0.0, 'use_original_formulation': False, 'start': 0.0, 'stop': 1.0, 'adaptive_projected_guidance_start_step': 38, 'enabled': True, '_class_name': 'AdaptiveProjectedMixGuidance', '_diffusers_version': '0.36.0.dev0'})
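Since every module with weights is listed as torch.bfloat16 (2 bytes per parameter), the parameter counts in the module table give a rough estimate of weight memory. A minimal sketch, covering weights only and ignoring activations, buffers, and caches; the helper name gib is just for illustration.

```python
BYTES_PER_BF16 = 2

# Parameter counts as listed in the module table above
params = {
    "vae": 405_575_491,
    "text_encoder": 8_292_166_656,
    "text_encoder_2": 219_314_944,
    "transformer": 17_425_795_520,
}

def gib(n_params: int) -> float:
    """Convert a bfloat16 parameter count to GiB of weight storage."""
    return n_params * BYTES_PER_BF16 / 2**30

for name, count in params.items():
    print(f"{name:15s} {gib(count):6.2f} GiB")

total = sum(params.values())
print(f"{'total':15s} {gib(total):6.2f} GiB")
```

By this estimate the transformer accounts for about two thirds of the roughly 49 GiB of weights.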

...
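The scheduler config above sets shift to 5.0 with use_dynamic_shifting disabled. As far as I can tell, that corresponds to the static time shift sigma' = shift * sigma / (1 + (shift - 1) * sigma), which keeps more of the steps at high noise levels. The sketch below illustrates the formula only; the linear starting grid and both function names are assumptions, not the scheduler's actual code.

```python
def shift_sigma(sigma: float, shift: float = 5.0) -> float:
    """Static time shift: shift * sigma / (1 + (shift - 1) * sigma)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

def shifted_schedule(num_steps: int, shift: float = 5.0) -> list[float]:
    """Apply the shift to a linear sigma grid (the grid itself is an assumption)."""
    if num_steps < 2:
        return [shift_sigma(1.0, shift)]
    # Linear grid from 1.0 down to 1/num_steps
    span = 1.0 - 1.0 / num_steps
    linear = [1.0 - i * span / (num_steps - 1) for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in linear]

for s in shifted_schedule(8):
    print(f"{s:.4f}")
```

With shift 1.0 the schedule is unchanged; with 5.0 a midpoint sigma of 0.5 maps to about 0.83.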