Model Info and links
https://huggingface.co/circlestone-labs/Anima
Test 0 - Seed and guidance
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
| CFG4.5, 50 steps | Seed: 1620085323 | Seed: 1931701040 | Seed: 4075624134 | Seed: 2736029172 |
|---|---|---|---|---|
| Bookshop girl | | | | |
| Face and hand | | | | |
| Legs and shoes | | | | |
Test 1 - Bookstore
Prompt: photorealistic girl in bookshop choosing the book in romantic stories shelf. smiling
Parameters: Steps: 30| Size: 1024x1024| Seed: 1620085323| CFG scale: 4| App: SD.Next| Version: c7ecba6| Pipeline: AnimaTextToImagePipeline| Operations: txt2img| Model: Anima-sdnext-diffusers
285H Time: 2m 26.69s | total 151.26 pipeline 146.65 preview 3.29 callback 0.98 | GPU 10654 MB 8% | RAM 22.38 GB 18%
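The Parameters and Time lines above are regular enough to parse, which makes it easy to compare runs by seconds per step. A minimal sketch, assuming the `Key: value|` layout and the `name seconds` timing pairs shown on this page; the function names are hypothetical helpers, not part of SD.Next:

```python
import re


def parse_parameters(line: str) -> dict[str, str]:
    """Turn 'Parameters: Steps: 30| Size: 1024x1024| ...' into a dict of strings."""
    body = line.strip().removeprefix("Parameters:")
    params = {}
    for field in body.split("|"):
        key, sep, value = field.partition(":")
        if sep:  # skip fragments that do not look like 'key: value'
            params[key.strip()] = value.strip()
    return params


def parse_timings(line: str) -> dict[str, float]:
    """Read the 'total 151.26 pipeline 146.65 ...' segment of a Time line."""
    segments = line.split("|")
    if len(segments) < 2:
        return {}
    pairs = re.findall(r"([a-z]+)\s+(\d+(?:\.\d+)?)", segments[1])
    return {name: float(value) for name, value in pairs}


def seconds_per_step(parameters_line: str, time_line: str) -> float:
    """Pipeline seconds divided by the number of sampling steps."""
    steps = int(parse_parameters(parameters_line)["Steps"])
    return parse_timings(time_line)["pipeline"] / steps


# Lines copied from Test 1 on this page.
PARAMS_LINE = (
    "Parameters: Steps: 30| Size: 1024x1024| Seed: 1620085323| CFG scale: 4| "
    "App: SD.Next| Version: c7ecba6| Pipeline: AnimaTextToImagePipeline| "
    "Operations: txt2img| Model: Anima-sdnext-diffusers"
)
TIME_LINE = (
    "285H Time: 2m 26.69s | total 151.26 pipeline 146.65 preview 3.29 "
    "callback 0.98 | GPU 10654 MB 8% | RAM 22.38 GB 18%"
)

print(f"{seconds_per_step(PARAMS_LINE, TIME_LINE):.2f} s/step")  # about 4.89
```

Applied to the Test 2 lines (32 steps, pipeline 160.07), the same helpers give roughly 5.00 s/step.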
| CFG / Steps | 5 | 10 | 20 | 30 | 40 | 50 | 100 |
|---|---|---|---|---|---|---|---|
| CFG1 | | | | | | | |
| CFG2 | | | | | | | |
| CFG3 | | | | | | | |
| CFG4 | | | | | | | |
| CFG5 | | | | | | | |
| CFG6 | | | | | | | |
| CFG8 | | | | | | | |
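A grid like the one above is a Cartesian product of CFG values and step counts. The sketch below enumerates the cells and gives a rough time budget. It assumes cost scales linearly with steps at the rate measured in the Test 1 run (146.65 s pipeline time for 30 steps) and ignores fixed overhead and any speed difference at CFG 1, so treat the estimate as approximate. The names are illustrative only.

```python
from itertools import product

# Row and column labels taken from the Test 1 grid.
CFG_VALUES = [1, 2, 3, 4, 5, 6, 8]
STEP_VALUES = [5, 10, 20, 30, 40, 50, 100]

# Assumption: per-step cost from the Test 1 run, 146.65 s for 30 steps.
SECONDS_PER_STEP = 146.65 / 30


def build_grid(cfgs, steps):
    """Return one dict per grid cell, row by row (CFG outer, steps inner)."""
    return [{"cfg": c, "steps": s} for c, s in product(cfgs, steps)]


def estimate_seconds(grid, seconds_per_step):
    """Rough total runtime if every step costs the same."""
    return sum(cell["steps"] for cell in grid) * seconds_per_step


grid = build_grid(CFG_VALUES, STEP_VALUES)
total = estimate_seconds(grid, SECONDS_PER_STEP)
print(len(grid), "images,", round(total / 3600, 1), "hours (estimate)")
```

Each cell keeps the same seed and prompt, so only CFG and steps vary between images.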
Test 2 - Face and hands
Prompt: Create a close-up photograph of a woman's face and hand, with her hand raised to her chin. She is wearing a white blazer and has a gold ring on her finger. Her nails are neatly manicured and her hair is pulled back into a low bun. She is smiling and has a radiant expression on her face. The background is a plain light gray color. The overall mood of the photo is elegant and sophisticated. The photo should have a soft, natural light and a slight warmth to it. The woman's hair is dark brown and pulled back into a low bun, with a few loose strands framing her face.
Parameters: Steps: 32| Size: 1024x1024| Seed: 4075624134| CFG scale: 4| App: SD.Next| Version: c7ecba6| Pipeline: AnimaTextToImagePipeline| Operations: txt2img| Model: Anima-sdnext-diffusers
285H Time: 2m 40.12s | total 170.54 pipeline 160.07 preview 9.08 callback 1.06 | GPU 10654 MB 8% | RAM 22.48 GB 18%
| CFG / Steps | 8 | 16 | 32 | 64 |
|---|---|---|---|---|
| CFG4 | | | | |
| CFG4.5 | | | | |
| CFG4.75 | | | | |
| CFG5 | | | | |
| CFG6 | | | | |
Test 3 - Legs
Prompt: Generate a photo of a woman's legs, with her feet crossed and wearing white high-heeled shoes with ribbons tied around her ankles. The shoes should have a pointed toe and a stiletto heel. The woman's legs should be smooth and tanned, with a slight sheen to them. The background should be a light gray color. The photo should be taken from a low angle, looking up at the woman's legs. The ribbons should be tied in a bow shape around the ankles. The shoes should have a red sole. The woman's legs should be slightly bent at the knee.
| CFG / Steps | 8 | 16 | 32 | 64 |
|---|---|---|---|---|
| CFG3 | | | | |
| CFG4 | | | | |
| CFG4.5 | | | | |
| CFG5 | | | | |
| CFG5.5 | | | | |
| CFG6 | | | | |
Test 4 - Other model covers
Test 5 - Other prompts
Test 6 - Optional: find the cover
Test 7 - Empty prompts
| seed:1 | seed:2 | seed:3 | seed:4 | seed:5 |
|---|---|---|---|---|
| seed:6 | seed:7 | seed:8 | seed:9 | seed:10 |
| seed:21 | seed:42 | seed:68 | seed:324 | seed:2026 |
Test 8 - Compare to Cosmos Predict 2B
| Cosmos | Anima |
|---|---|
| | |

Prompt: masterpiece, best quality, score_7, safe. An anime girl wearing a black tank-top and denim shorts is standing outdoors. She's holding a rectangular sign out in front of her that reads "ANIMA" in solid white text. She's looking at the viewer with a smile. The background features some trees and blue sky with clouds. In the top left corner there is "2B" written in white drippy text.
Negative: out of frame, cropped
Parameters: Steps: 30| Size: 1024x1024| Seed: 3389420977| CFG scale: 4| App: SD.Next| Version: 5b0f86b| Pipeline: Cosmos2TextToImagePipeline| Operations: txt2img| Model: Cosmos-Predict2-2B-Text2Image
Time: 2m 30.45s | total 405.70 pipeline 150.40 preview 125.88 callback 124.10 vae 3.16 te 2.11 | GPU 18516 MB 15% | RAM 26.01 GB 21%
System Info
Tue Feb 3 12:50:13 2026
Backend: ipex Pipeline: native Memory optimization: none Cross-attention: Scaled-Dot-Product
app: sdnext.git updated: 2026-02-02 hash: c7ecba67c tag: tags: url: https://github.com/liutyi/sdnext/tree/pytorch
arch: x86_64 cpu: x86_64 system: Linux release: 6.17.0-8-generic python: 3.12.3
Pytorch: 2.10.0+xpu device: Intel(R) Arc(TM) Graphics (1) ipex:
ram: free:112.69 used:10.38 total:123.07
xformers: diffusers: 0.37.0.dev0 transformers: 4.57.5
active: xpu dtype: torch.bfloat16 vae: torch.bfloat16 unet: torch.bfloat16
base: CalamitousFelicitousness/Anima-sdnext-diffusers refiner: none vae: none te: none unet: none
App config
.
Model metadata
CalamitousFelicitousness/Anima-sdnext-diffusers
| Module | Class | Device | Dtype | Quant | Params | Modules | Config |
|---|---|---|---|---|---|---|---|
| text_encoder | Qwen3Model | xpu:0 | torch.bfloat16 | None | 596049920 | 425 | Qwen3Config { "architectures": [ "Qwen3Model" ], "attention_bias": false, "attention_dropout": 0.0, "dtype": "bfloat16", "head_dim": 128, "hidden_act": "silu", "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 3072, "layer_types": [ "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention", "full_attention" ], "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen3", "num_attention_heads": 16, "num_hidden_layers": 28, "num_key_value_heads": 8, "rms_norm_eps": 1e-06, "rope_scaling": null, "rope_theta": 1000000.0, "sliding_window": null, "tie_word_embeddings": false, "transformers_version": "4.57.5", "use_cache": false, "use_sliding_window": false, "vocab_size": 151936 } |
| tokenizer | Qwen2TokenizerFast | None | None | None | 0 | 0 | None |
| t5_tokenizer | T5TokenizerFast | None | None | None | 0 | 0 | None |
| llm_adapter | AnimaLLMAdapter | xpu:0 | torch.bfloat16 | None | 134663680 | 139 | FrozenDict({'source_dim': 1024, 'target_dim': 1024, 'model_dim': 1024, 'num_layers': 6, 'num_heads': 16, 'mlp_ratio': 4.0, 'vocab_size': 32128, 'use_self_attn': True, '_class_name': 'AnimaLLMAdapter', '_diffusers_version': '0.37.0', '_name_or_path': 'CalamitousFelicitousness/Anima-sdnext-diffusers'}) |
| transformer | CosmosTransformer3DModel | xpu:0 | torch.bfloat16 | None | 1956405248 | 1138 | FrozenDict({'in_channels': 16, 'out_channels': 16, 'num_attention_heads': 16, 'attention_head_dim': 128, 'num_layers': 28, 'mlp_ratio': 4.0, 'text_embed_dim': 1024, 'adaln_lora_dim': 256, 'max_size': [128, 240, 240], 'patch_size': [1, 2, 2], 'rope_scale': [1.0, 4.0, 4.0], 'concat_padding_mask': True, 'extra_pos_embed_type': None, 'use_crossattn_projection': False, 'crossattn_proj_in_channels': 1024, 'encoder_hidden_states_channels': 1024, '_use_default_values': ['use_crossattn_projection', 'crossattn_proj_in_channels', 'encoder_hidden_states_channels'], '_class_name': 'CosmosTransformer3DModel', '_diffusers_version': '0.37.0', '_name_or_path': 'CalamitousFelicitousness/Anima-sdnext-diffusers'}) |
| vae | AutoencoderKLWan | xpu:0 | torch.bfloat16 | None | 126892531 | 260 | FrozenDict({'base_dim': 96, 'decoder_base_dim': None, 'z_dim': 16, 'dim_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_scales': [], 'temperal_downsample': [False, True, True], 'dropout': 0.0, 'latents_mean': [-0.7571, -0.7089, -0.9113, 0.1075, -0.1745, 0.9653, -0.1517, 1.5508, 0.4134, -0.0715, 0.5517, -0.3632, -0.1922, -0.9497, 0.2503, -0.2921], 'latents_std': [2.8184, 1.4541, 2.3275, 2.6558, 1.2196, 1.7708, 2.6052, 2.0743, 3.2687, 2.1526, 2.8652, 1.5579, 1.6382, 1.1253, 2.8251, 1.916], 'is_residual': False, 'in_channels': 3, 'out_channels': 3, 'patch_size': None, 'scale_factor_temporal': 4, 'scale_factor_spatial': 8, '_use_default_values': ['is_residual', 'decoder_base_dim', 'in_channels', 'patch_size', 'out_channels', 'scale_factor_spatial', 'scale_factor_temporal'], '_class_name': 'AutoencoderKLWan', '_diffusers_version': '0.33.0.dev0', '_name_or_path': '/mnt/models/Diffusers/models--CalamitousFelicitousness--Anima-sdnext-diffusers/snapshots/587e3941c37ace6234f9c0daa5c908408652870a/vae'}) |
| scheduler | FlowMatchEulerDiscreteScheduler | None | None | None | 0 | 0 | FrozenDict({'num_train_timesteps': 1000, 'shift': 3.0, 'use_dynamic_shifting': False, 'base_shift': 0.5, 'max_shift': 1.15, 'base_image_seq_len': 256, 'max_image_seq_len': 4096, 'invert_sigmas': False, 'shift_terminal': None, 'use_karras_sigmas': False, 'use_exponential_sigmas': False, 'use_beta_sigmas': False, 'time_shift_type': 'exponential', 'stochastic_sampling': False, '_use_default_values': ['time_shift_type', 'max_image_seq_len', 'shift_terminal', 'use_beta_sigmas', 'base_image_seq_len', 'use_exponential_sigmas', 'max_shift', 'use_dynamic_shifting', 'use_karras_sigmas', 'stochastic_sampling', 'base_shift', 'invert_sigmas'], '_class_name': 'FlowMatchEulerDiscreteScheduler', '_diffusers_version': '0.37.0'}) |
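The Params column above can be summed to get a rough idea of weight memory, and the VAE and transformer configs give the latent sequence length at 1024x1024. A small sketch under stated assumptions: all weights are stored in bfloat16 at 2 bytes each, as the Dtype column reports; the token count assumes a single frame, the spatial scale factor of 8 from the VAE config, and the 2x2 spatial patch from the transformer config. Function names are illustrative.

```python
# Parameter counts copied from the Params column of the table above.
PARAMS = {
    "text_encoder": 596_049_920,
    "llm_adapter": 134_663_680,
    "transformer": 1_956_405_248,
    "vae": 126_892_531,
}

BYTES_PER_PARAM_BF16 = 2  # bfloat16 is a 16-bit format


def total_params(params: dict[str, int]) -> int:
    """Sum the parameter counts of all listed modules."""
    return sum(params.values())


def weight_gib(params: dict[str, int], bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Approximate size of the weights alone, in GiB."""
    return total_params(params) * bytes_per_param / 2**30


def latent_tokens(width: int, height: int, spatial_scale: int = 8, patch: int = 2) -> int:
    """Image tokens for one frame: pixels -> latent grid -> patches."""
    cols = width // spatial_scale // patch
    rows = height // spatial_scale // patch
    return cols * rows


print(f"{total_params(PARAMS):,} parameters")
print(f"{weight_gib(PARAMS):.2f} GiB of bf16 weights")
print(f"{latent_tokens(1024, 1024)} tokens at 1024x1024")
```

The weights alone come to a bit over 5 GiB, so the larger GPU figure reported in the Time lines presumably also covers activations, caches, and runtime overhead.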



