Info
Kandinsky 5.0 Image Lite is a line-up of 6B image generation models with the following capabilities:
- 1K resolution (1280x768, 1024x1024 and others)
- High visual quality
- Strong text-writing
- Russian concepts understanding
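
The resolutions listed above are passed to the pipeline as `width` and `height`. As a minimal sketch, the helper below picks the listed resolution closest to a desired aspect ratio. The helper name `closest_resolution` is hypothetical, the pairs are read as width x height (an assumption), and only the two resolutions named above are included, since the others are not spelled out here.

```python
# Resolutions named in the model description, read as (width, height).
# The description mentions "others", which are not listed here.
KNOWN_RESOLUTIONS = [(1280, 768), (1024, 1024)]


def closest_resolution(aspect_ratio, candidates=KNOWN_RESOLUTIONS):
    """Return the (width, height) pair whose width/height ratio is nearest to aspect_ratio."""
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    # Compare each candidate's ratio with the requested one and keep the nearest.
    return min(candidates, key=lambda wh: abs(wh[0] / wh[1] - aspect_ratio))


# Hypothetical usage: a wide 16:9 target maps to the landscape entry.
width, height = closest_resolution(16 / 9)
print(width, height)
```

The returned values could then be supplied as the `width` and `height` arguments of the pipeline call.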
https://github.com/kandinskylab/kandinsky-5
...
https://habr.com/ru/companies/sberbank/articles/951800/
| Transformer | Text Encoder | Text Encoder 2 | Scheduler | Tokenizer | Tokenizer 2 | VAE |
|---|---|---|---|---|---|---|
| Kandinsky5Transformer3DModel | Qwen2.5-VL-7B-Instruct | openai/clip-vit-large-patch14 | Euler FlowMatch | AutoProcessor Qwen2.5-VL | CLIPTokenizer | FLUX.1-dev vae |
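
To check which objects a loaded pipeline actually carries for each slot in the component table above, something like the following sketch may help. It relies on the `components` dictionary that diffusers pipelines expose. The key spellings (`text_encoder_2`, `tokenizer_2`, and so on) are an assumption based on common diffusers naming, and `summarize_components` is a hypothetical helper, not part of the library.

```python
# Component slots from the table, keyed by the attribute names that diffusers
# pipelines conventionally use (the exact key spelling is an assumption).
EXPECTED_COMPONENTS = {
    "transformer": "Kandinsky5Transformer3DModel",
    "text_encoder": "Qwen2.5-VL-7B-Instruct",
    "text_encoder_2": "openai/clip-vit-large-patch14",
    "scheduler": "Euler FlowMatch",
    "tokenizer": "AutoProcessor Qwen2.5-VL",
    "tokenizer_2": "CLIPTokenizer",
    "vae": "FLUX.1-dev vae",
}


def summarize_components(components):
    """Map each expected slot to the class name of the loaded object, or None if absent."""
    return {
        name: type(components[name]).__name__ if name in components else None
        for name in EXPECTED_COMPONENTS
    }
```

With a loaded pipeline, `summarize_components(pipe.components)` would return one entry per table column, which makes a missing or unexpected component easy to spot.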
Examples:
```python
>>> import torch
>>> from diffusers import Kandinsky5T2IPipeline
>>> # Available models:
>>> # kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers
>>> # kandinskylab/Kandinsky-5.0-T2I-Lite-pretrain-Diffusers
>>> model_id = "kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers"
>>> pipe = Kandinsky5T2IPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
>>> pipe = pipe.to("cuda")
>>> prompt = "A cat and a dog baking a cake together in a kitchen."
>>> output = pipe(
...     prompt=prompt,
...     negative_prompt="",
...     height=1024,
...     width=1024,
...     num_inference_steps=50,
...     guidance_scale=3.5,
... ).frames[0]
```
Test 0 - Different seed variations
...