...
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 32| Size: 1024x1024| Seed:1931701040| CFG scale: 4| App: SD.Next| Version: e7bd067| Pipeline: LongCatImagePipeline| Operations: txt2img| Model: LongCat-Image
285H Time: 16m 59.25s | total 1019.55 pipeline 1017.69 decode 1.53 gc 0.29 | GPU 31792 MB 25% | RAM 49.0 GB 40%
| 4 | 8 | 16 | 32 | 64 | |
|---|---|---|---|---|---|
| CFG1 | |||||
| CFG2 | |||||
| CFG3 | |||||
| CFG4 | |||||
| CFG5 | |||||
| CFG6 | |||||
| CFG8 |
...
Prompts are in Test 42 - All models cover image
Test 6 - Art Prompts
Test 7 - Finding the CoverÂ
small mirror
mirror wall graffity
neon sign
System info
| Code Block |
|---|
Fri Dec 19 06:59:12 2025 app: sdnext.git updated: 2025-12-16 hash: c53ebcac5 url: https://github.com/liutyi/sdnext/tree/pytorch arch: x86_64 cpu: x86_64 system: Linux release: 6.17.0-8-generic python: 3.12.3 PyTorch 2.9.1+xpu device: Intel(R) Arc(TM) Graphics (1) ipex: ram: free:118.03 used:5.05 total:123.07 xformers: diffusers: 0.36.0.dev0 transformers: 4.57.3 active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16 base: meituan-longcat/LongCat-Image refiner: none vae: none te: none unet: none ipex native none Scaled-Dot-Product |
...
meituan-longcat/LongCat-Image
| Module | Class | Device | Dtype | Quant | Params | Modules | Config |
|---|---|---|---|---|---|---|---|
| vae | AutoencoderKL | xpu:0 | torch.bfloat16 | None | 83819683 | 241 | FrozenDict({'in_channels': 3, 'out_channels': 3, 'down_block_types': ['DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D', 'DownEncoderBlock2D'], 'up_block_types': ['UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D', 'UpDecoderBlock2D'], 'block_out_channels': [128, 256, 512, 512], 'layers_per_block': 2, 'act_fn': 'silu', 'latent_channels': 16, 'norm_num_groups': 32, 'sample_size': 1024, 'scaling_factor': 0.3611, 'shift_factor': 0.1159, 'latents_mean': None, 'latents_std': None, 'force_upcast': True, 'use_quant_conv': False, 'use_post_quant_conv': False, 'mid_block_add_attention': True, '_class_name': 'AutoencoderKL', '_diffusers_version': '0.30.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--meituan-longcat--LongCat-Image/snapshots/d2ea50b79a930074c37b9b97ce45e3b2ea8cf4d8/vae'}) |
| text_encoder | Qwen2_5_VLForConditionalGeneration | xpu:0 | torch.bfloat16 | None | 8292166656 | 763 | Qwen2_5_VLConfig { "architectures": [ "Qwen2_5_VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 128000, "max_window_layers": 28, "model_type": "qwen2_5_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "text_config": { "_name_or_path": "hunyuanvideo-community/HunyuanImage-2.1-Diffusers", "architectures": [ "Qwen2_5_VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "dtype": "bfloat16", "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "layer_types": [ "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention" ], "max_position_embeddings": 128000, "max_window_layers": 28, "model_type": "qwen2_5_vl_text", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": null, "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }, "tie_word_embeddings": false, "transformers_version": "4.57.3", "use_cache": true, "use_sliding_window": false, "vision_config": { "depth": 32, "dtype": "bfloat16", "fullatt_block_indexes": [ 7, 15, 23, 31 ], "hidden_act": "silu", "hidden_size": 1280, "in_channels": 3, "in_chans": 3, "initializer_range": 0.02, "intermediate_size": 3420, "model_type": "qwen2_5_vl", "num_heads": 16, "out_hidden_size": 3584, "patch_size": 14, "spatial_merge_size": 2, "spatial_patch_size": 14, "temporal_patch_size": 2, "tokens_per_second": 2, "window_size": 112 }, "vision_token_id": 151654, "vocab_size": 152064 } |
| tokenizer | Qwen2Tokenizer | None | None | None | 0 | 0 | None |
| transformer | LongCatImageTransformer2DModel | xpu:0 | torch.bfloat16 | None | 6270668864 | 677 | FrozenDict({'patch_size': 1, 'in_channels': 64, 'num_layers': 10, 'num_single_layers': 20, 'attention_head_dim': 128, 'num_attention_heads': 24, 'joint_attention_dim': 3584, 'pooled_projection_dim': 3584, 'axes_dims_rope': [16, 56, 56], '_use_default_values': ['axes_dims_rope'], '_class_name': 'LongCatImageTransformer2DModel', '_diffusers_version': '0.30.0.dev0', 'guidance_embeds': False, '_name_or_path': 'meituan-longcat/LongCat-Image'}) |
| scheduler | FlowMatchEulerDiscreteScheduler | None | None | None | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'shift': 3.0, 'use_dynamic_shifting': True, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'stochastic_sampling': False, '_use_default_values': ['shift_terminal', 'stochastic_sampling', 'time_shift_type', 'use_exponential_sigmas', 'use_karras_sigmas', 'use_beta_sigmas', 'invert_sigmas'], '_class_name': 'FlowMatchEulerDiscreteScheduler', '_diffusers_version': '0.30.0.dev0'}) |
| text_processor | Qwen2VLProcessor | None | None | None | 0 | 0 | None |