Model Info and links

https://huggingface.co/black-forest-labs/FLUX.2-klein-base-4B

import torch
from diffusers import Flux2KleinPipeline

device = "cuda"
dtype = torch.bfloat16

pipe = Flux2KleinPipeline.from_pretrained("black-forest-labs/FLUX.2-klein-base-4B", torch_dtype=dtype)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=4.0,
    num_inference_steps=50,
    generator=torch.Generator(device=device).manual_seed(0)
).images[0]
image.save("flux-klein.png")


Test 0 - Seed and guidance

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Parameters: Steps: 50| Size: 1024x1024| Scheduler: FlowMatchEulerDiscreteScheduler| Seed: 2736029172| CFG scale: 4| App: SD.Next| Version: c95ff3d| Pipeline: Flux2KleinPipeline| Operations: txt2img| Model: FLUX.2-klein-base-4B
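The "Parameters:" lines on this page all follow the same `key: value|` layout, so they can be turned into a dictionary for comparison across runs. The following is a minimal sketch; `parse_parameters` is a hypothetical helper written for this page, not part of SD.Next.

```python
def parse_parameters(line: str) -> dict[str, str]:
    """Split a 'Parameters: key: value| key: value| ...' line into a dict."""
    body = line.strip()
    prefix = "Parameters:"
    if body.startswith(prefix):
        body = body[len(prefix):]
    fields: dict[str, str] = {}
    for part in body.split("|"):
        # Each part looks like " Steps: 50"; split on the first colon only.
        key, sep, value = part.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


line = (
    "Parameters: Steps: 50| Size: 1024x1024| Scheduler: FlowMatchEulerDiscreteScheduler| "
    "Seed: 2736029172| CFG scale: 4| App: SD.Next| Version: c95ff3d| "
    "Pipeline: Flux2KleinPipeline| Operations: txt2img| Model: FLUX.2-klein-base-4B"
)
params = parse_parameters(line)
print(params["Steps"], params["Seed"], params["CFG scale"])
```

All values are kept as strings; converting `Steps` or `Seed` to integers is left to the caller.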

285H Time: 6m 32.89s | total 715.32 pipeline 392.86 preview 227.32 callback 95.10 | GPU 20062 MB 16% | RAM 26.75 GB 22%
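To compare runs with different step counts, the timing line can be reduced to a rough seconds-per-step figure. This is a sketch under two assumptions: the second `|`-separated segment holds the `name value` timing pairs, and dividing the `pipeline` time by the step count is a fair approximation (it also includes any non-denoising work inside the pipeline). `parse_timings` is a hypothetical helper.

```python
import re


def parse_timings(line: str) -> dict[str, float]:
    """Extract 'name seconds' pairs from the second '|'-separated segment."""
    segment = line.split("|")[1]
    pairs = re.findall(r"([a-z]+)\s+(\d+(?:\.\d+)?)", segment)
    return {name: float(value) for name, value in pairs}


line = (
    "285H Time: 6m 32.89s | total 715.32 pipeline 392.86 preview 227.32 "
    "callback 95.10 | GPU 20062 MB 16% | RAM 26.75 GB 22%"
)
timings = parse_timings(line)
steps = 50  # taken from the matching "Parameters:" line
sec_per_step = timings["pipeline"] / steps
print(f"{sec_per_step:.2f} s/step")
```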

Grid: CFG 4, 50 steps
Columns (seeds): 1620085323 | 1931701040 | 4075624134 | 2736029172
Rows (prompts): Bookshop girl | Face and hand | Legs and shoes

Test 1 - Bookstore

Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling

Parameters: Steps: 32| Size: 1024x1024| Scheduler: FlowMatchEulerDiscreteScheduler| Seed: 1931701040| CFG scale: 6| App: SD.Next| Version: c95ff3d| Pipeline: Flux2KleinPipeline| Operations: txt2img| Model: FLUX.2-klein-base-4B

285H Time: 4m 12.24s | total 428.93 pipeline 252.21 preview 121.19 callback 55.50 | GPU 19696 MB 16% | RAM 29.23 GB 24%



Grid: columns are steps, rows are CFG scale
Steps: 8 | 16 | 32 | 64
CFG: 1 | 2 | 3 | 4 | 5 | 6 | 8
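The steps-by-CFG grid used in Tests 1 to 3 (steps 8, 16, 32, 64 against CFG 1 to 6 and 8) can be enumerated as a list of jobs. This is a sketch only: `build_grid` is a hypothetical helper, and reusing one seed for every cell is an assumption, since the page records a single "Parameters:" line per test.

```python
from itertools import product

STEPS = [8, 16, 32, 64]
CFG_SCALES = [1, 2, 3, 4, 5, 6, 8]


def build_grid(seed: int, steps=STEPS, cfg_scales=CFG_SCALES) -> list[dict]:
    """Return one job per (CFG, steps) cell, row by row as in the grid."""
    return [
        {"seed": seed, "guidance_scale": float(cfg), "num_inference_steps": n}
        for cfg, n in product(cfg_scales, steps)
    ]


grid = build_grid(seed=1931701040)
print(len(grid), grid[0])

# Each job could then be passed to the pipeline from the model card example
# (assumes `pipe`, `prompt` and `device` are already set up as shown there):
#   image = pipe(prompt, height=1024, width=1024,
#                guidance_scale=job["guidance_scale"],
#                num_inference_steps=job["num_inference_steps"],
#                generator=torch.Generator(device=device).manual_seed(job["seed"]),
#                ).images[0]
```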

Test 2 - Face and hands

Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.

Parameters: Steps: 16| Size: 1024x1024| Scheduler: FlowMatchEulerDiscreteScheduler| Seed: 1620085323| CFG scale: 1.0| App: SD.Next| Version: c95ff3d| Pipeline: Flux2KleinPipeline| Operations: txt2img| Model: FLUX.2-klein-base-4B

285H Time: 1m 3.77s | total 136.09 pipeline 63.73 callback 61.13 preview 10.65 vae 0.54 | GPU 19696 MB 16% | RAM 29.51 GB 24%


Grid: columns are steps, rows are CFG scale
Steps: 8 | 16 | 32 | 64
CFG: 1 | 2 | 3 | 4 | 5 | 6 | 8

Test 3 - Legs

Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.

Parameters: Steps: 64| Size: 1024x1024| Scheduler: FlowMatchEulerDiscreteScheduler| Seed: 1931701040| CFG scale: 4| App: SD.Next| Version: c95ff3d| Pipeline: Flux2KleinPipeline| Operations: txt2img| Model: FLUX.2-klein-base-4B

285H Time: 8m 15.91s | total 828.99 pipeline 495.88 callback 308.33 preview 24.74 | GPU 22654 MB 18% | RAM 29.42 GB 24%


Grid: columns are steps, rows are CFG scale
Steps: 8 | 16 | 32 | 64
CFG: 1 | 2 | 3 | 4 | 5 | 6 | 8

Test 4 - Other model covers

Test 5 - Other prompts


Test 6 - Optional find the cover


Test 7 - Empty prompts


seed: 1 | seed: 2 | seed: 3 | seed: 4 | seed: 5

seed: 6 | seed: 7 | seed: 8 | seed: 9 | seed: 10

seed: 21 | seed: 42 | seed: 68 | seed: 324 | seed: 2026


System Info

Wed Feb  4 19:15:39 2026
app: sdnext.git updated: 2026-02-02 hash: c7ecba67c tag:  tags:  url: https://github.com/liutyi/sdnext/tree/pytorch
arch: x86_64 cpu: x86_64 system: Linux release: 6.14.0-37-generic 
python: 3.12.3 PyTorch: 2.10.0+xpu
device: Intel(R) Arc(TM) Graphics (1) ipex: 
ram: free:51.66 used:10.67 total:62.33
xformers: diffusers: 0.37.0.dev0 transformers: 4.57.5
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: black-forest-labs/FLUX.2-klein-base-4B refiner: none  vae: none te: none unet: none


App config

.


Model metadata

black-forest-labs/FLUX.2-klein-base-4B

Module | Class | Device | Dtype | Quant | Params | Modules | Config
vae | AutoencoderKLFlux2 | xpu:0 | torch.bfloat16 | None | 84046115 | 244

FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 32, 'norm_num_groups': 32, 'sample_size': 1024, 'force_upcast': True, 'use_quant_conv': True, 'use_post_quant_conv': True, 'mid_block_add_attention': True, 'batch_norm_eps': 0.0001, 'batch_norm_momentum': 0.1, 'patch_size': [2, 2], '_class_name': 'AutoencoderKLFlux2', '_diffusers_version': '0.37.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--black-forest-labs--FLUX.2-klein-base-4B/snapshots/8c44a2fbef88fae175da65df054db0f901aa9747/vae'})

text_encoder | Qwen3ForCausalLM | xpu:0 | torch.bfloat16 | None | 4022468096 | 547

Qwen3Config { "architectures": [ "Qwen3ForCausalLM" ], "attention_bias": false, "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "head_dim": 128, "hidden_act": "silu", "hidden_size": 2560, "initializer_range": 0.02, "intermediate_size": 9728, "layer_types": [ "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention" ], "max_position_embeddings": 40960, "max_window_layers": 36, "model_type": "qwen3", "num_attention_heads": 32, "num_hidden_layers": 36, "num_key_value_heads": 8, "rms_norm_eps": 1e-06, "rope_scaling": null, "rope_theta": 1000000, "sliding_window": null, "tie_word_embeddings": true, "transformers_version": "4.57.5", "use_cache": true, "use_sliding_window": false, "vocab_size": 151936 }

tokenizer | Qwen2TokenizerFast | None | None | None | 0 | 0

None

scheduler | FlowMatchEulerDiscreteScheduler | None | None | None | 0 | 0

FrozenDict({'num_train_timesteps': 1000, 'shift': 3.0, 'use_dynamic_shifting': True, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'stochastic_sampling': False, '_class_name': 'FlowMatchEulerDiscreteScheduler', '_diffusers_version': '0.37.0.dev0'})

transformer | Flux2Transformer2DModel | xpu:0 | torch.bfloat16 | None | 3875544576 | 356

FrozenDict({'patch_size': 1, 'in_channels': 128, 'out_channels': None, 'num_layers': 5, 'num_single_layers': 20, 'attention_head_dim': 128, 'num_attention_heads': 24, 'joint_attention_dim': 7680, 'timestep_guidance_channels': 256, 'mlp_ratio': 3.0, 'axes_dims_rope': [32, 32, 32, 32], 'rope_theta': 2000, 'eps': 1e-06, 'guidance_embeds': False, '_class_name': 'Flux2Transformer2DModel', '_diffusers_version': '0.37.0.dev0', '_name_or_path': 'black-forest-labs/FLUX.2-klein-base-4B'})

is_distilled | bool | None | None | None | 0 | 0

None
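The parameter counts in the metadata table give a rough lower bound on the memory needed just to hold the weights in bfloat16. The sketch below assumes the flattened Params and Modules columns split as 84046115/244 (vae), 4022468096/547 (text_encoder) and 3875544576/356 (transformer); that split is a reading of the table, not something the page states explicitly. Activations, caches and framework overhead are not included.

```python
# Parameter counts as read from the metadata table (assumed column split).
PARAMS = {
    "vae": 84_046_115,
    "text_encoder": 4_022_468_096,
    "transformer": 3_875_544_576,
}
BYTES_PER_PARAM = 2  # torch.bfloat16 stores each value in 2 bytes

total_params = sum(PARAMS.values())
weights_gib = total_params * BYTES_PER_PARAM / 2**30

for name, count in PARAMS.items():
    print(f"{name}: {count * BYTES_PER_PARAM / 2**30:.2f} GiB")
print(f"total: {total_params} params, {weights_gib:.2f} GiB of weights")
```

The result can be set against the GPU figures in the test logs, which report memory in use during generation rather than weights alone.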

Schedule

FLUX.2-klein-base-4B-model-test.json
