| SIMPLE BASELINE (sanity + aesthetic bias) | CONTROLLED VARIATION (multi-attribute prompts) | COMPLEX PROMPTS (relationships + reasoning) | HARD / FAILURE CASES |
|---|---|---|---|
| 1. Subject clarity (minimal prompt) | 6. Multi-object + attributes | 11. Spatial relationships | 16. Multi-subject + attributes |
| a red apple on a wooden table, soft natural lighting | three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface | a wooden chair placed on top of a table inside a small room, viewed from the doorway | two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background |
| 2. Style adherence | 7. Material + lighting interaction | 12. Action + interaction | 17. Complex scene description |
| a small cabin in the mountains, watercolor painting, pastel tones | a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows | a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition | a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge |
| 3. Photorealism baseline | 8. Camera + realism | 13. Counting + variation | 18. Reflection + physics |
| portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens | cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field | five birds sitting on a wire, each bird a different color and size | a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions |
| 4. Composition / framing | 9. Style fusion | 14. Text rendering | 19. Style + realism conflict |
| a cat sitting in the center of a window frame, symmetrical composition, morning light | a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting | a storefront sign that clearly reads "OPEN 24 HOURS", realistic street photography | a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting |
| 5. Color control | 10. Perspective / angle | 15. Lighting logic | 20. Compositional constraint |
| a street scene at night illuminated only by neon blue and pink lights | extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion | a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff | a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white |
I couldn’t directly open your private wiki pages (they require access), but I recovered enough context from your previous matrices and general prompt-matrix methodology to design a clean, high-signal 4×5 test grid.
Also, based on how prompt matrices work (systematically varying elements to expose model strengths and weaknesses), I built this to:
- Cover difficulty tiers (easy → failure cases)
- Vary structure, verbosity, and syntax
- Test composition, counting, text rendering, style fidelity, realism, and reasoning
- Avoid negative prompts (as requested)
- Keep each prompt visually comparable across models
✅ TEXT2IMAGE TEST MATRIX v3 (4×5 = 20 prompts)
🧩 STRUCTURE
Rows = complexity tier
Columns = test category
🟢 ROW 1: SIMPLE BASELINE (sanity + aesthetic bias)
1. Subject clarity (minimal prompt)
a red apple on a wooden table, soft natural lighting
2. Style adherence
a small cabin in the mountains, watercolor painting, pastel tones
3. Photorealism baseline
portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens
4. Composition / framing
a cat sitting in the center of a window frame, symmetrical composition, morning light
5. Color control
a street scene at night illuminated only by neon blue and pink lights
🟡 ROW 2: CONTROLLED VARIATION (multi-attribute prompts)
6. Multi-object + attributes
three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface
7. Material + lighting interaction
a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows
8. Camera + realism
cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field
9. Style fusion
a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting
10. Perspective / angle
extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion
🟠 ROW 3: COMPLEX PROMPTS (relationships + reasoning)
11. Spatial relationships
a wooden chair placed on top of a table inside a small room, viewed from the doorway
12. Action + interaction
a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition
13. Counting + variation
five birds sitting on a wire, each bird a different color and size
14. Text rendering
a storefront sign that clearly reads "OPEN 24 HOURS", realistic street photography
15. Lighting logic
a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff
🔴 ROW 4: HARD / FAILURE CASES
16. Multi-subject + attributes
two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background
17. Complex scene description
a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge
18. Reflection + physics
a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions
19. Style + realism conflict
a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting
20. Compositional constraint
a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white
🧠 WHY THIS MATRIX WORKS
This isn’t random; it’s intentionally aligned with known evaluation axes in text-to-image research:
- Models struggle with multi-object attribute binding and counting
- Prompt structure (subject + style + modifiers) strongly impacts output quality
What each row exposes:
| Row | What breaks |
|---|---|
| 1 | bias, default aesthetic, prompt understanding |
| 2 | attribute binding, style blending |
| 3 | reasoning, counting, text |
| 4 | attribute swaps between similar subjects, dense scenes, reflection physics, style vs. realism conflicts, strict geometry |
⚡ EXTRA (OPTIONAL EXTENSIONS)
To make this behave even more like your previous matrices:
Add controlled variables per run
...
Same seed vs random seed
...
CFG scaling sweep (e.g. 3 / 7 / 12)
...
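The seed and CFG extensions above can be planned as one small grid before anything is generated. The following is a minimal sketch, assuming the three CFG values from the example and a hypothetical `sweep_plan` helper; it only prints the combinations and does not call the API. The value `-1` follows the convention used in the script settings, where it requests a random seed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one "cfg seed" pair per line for a CFG sweep.
# The CFG values mirror the 3 / 7 / 12 example; they are not tuned recommendations.
sweep_plan() {
  local fixed_seed="$1"
  local cfg seed
  for cfg in 3 7 12; do
    # A fixed seed isolates the effect of CFG; -1 asks the backend for a random seed.
    for seed in "$fixed_seed" -1; do
      printf '%s %s\n' "$cfg" "$seed"
    done
  done
}

# Example: list the six (cfg, seed) combinations for the matrix seed.
sweep_plan 20260425
```

Each printed pair could then replace the constant `CFG` and `SEED` values for one pass over the prompt list, which keeps every other setting fixed for a fair comparison.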
Script for SD.Next
| Code Block |
|---|
#!/usr/bin/env bash
API="http://127.0.0.1:7860/sdapi/v1/txt2img"
OUTDIR="outputs_matrix_v3"
mkdir -p "$OUTDIR"
# ---- GLOBAL SETTINGS (KEEP CONSTANT FOR FAIR COMPARISON) ----
STEPS=8
CFG=1
WIDTH=1024
HEIGHT=1024
SAMPLER="Default"
SEED=20260425 # set -1 for random
#MODEL='Diffusers/baidu/ERNIE-Image-Turbo [54f8a75695]' # optionally pass model name
# ---- PROMPT LIST ----
prompts=(
"a red apple on a wooden table, soft natural lighting"
"a small cabin in the mountains, watercolor painting, pastel tones"
"portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens"
"a cat sitting in the center of a window frame, symmetrical composition, morning light"
"a street scene at night illuminated only by neon blue and pink lights"
"three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface"
"a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows"
"cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field"
"a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting"
"extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion"
"a wooden chair placed on top of a table inside a small room, viewed from the doorway"
"a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition"
"five birds sitting on a wire, each bird a different color and size"
"a storefront sign that clearly reads \"OPEN 24 HOURS\", realistic street photography"
"a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff"
"two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background"
"a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge"
"a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions"
"a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting"
"a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white"
)
# ---- OPTIONAL: SWITCH MODEL ----
if [ -n "$MODEL" ]; then
  echo "🔄 Switching model to: $MODEL"
  curl -s -X POST http://127.0.0.1:7860/sdapi/v1/options \
    -H "Content-Type: application/json" \
    -d "{\"sd_model_checkpoint\": \"$MODEL\"}" > /dev/null
  sleep 2
fi
# ---- GENERATION LOOP ----
i=1
total=${#prompts[@]}
for prompt in "${prompts[@]}"; do
  printf "\n[%02d/%02d] Generating...\n" "$i" "$total"
  # Build the request body with jq so the prompt is JSON-escaped correctly
  json=$(jq -n \
    --arg prompt "$prompt" \
    --arg sampler "$SAMPLER" \
    --argjson steps "$STEPS" \
    --argjson cfg "$CFG" \
    --argjson w "$WIDTH" \
    --argjson h "$HEIGHT" \
    --argjson seed "$SEED" \
    '{
      prompt: $prompt,
      steps: $steps,
      cfg_scale: $cfg,
      width: $w,
      height: $h,
      sampler_name: $sampler,
      seed: $seed,
      batch_size: 1,
      n_iter: 1
    }')
  response=$(curl -s "$API" \
    -H "Content-Type: application/json" \
    -d "$json")
  # Extract the base64 image and save it; skip the file if the response has no image
  outfile="$OUTDIR/$(date +%F)_$(printf "%02d" "$i")_seed${SEED}.png"
  if image=$(jq -er '.images[0]' <<< "$response"); then
    base64 -d <<< "$image" > "$outfile"
  else
    echo "⚠️ No image in response for prompt $i, skipping" >&2
  fi
  ((i++))
done
echo "✅ Done. Images saved to $OUTDIR/"
|
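Because the script decodes whatever the API returns, a failed request can leave a file that is not an image. One way to check the output folder afterwards is to test each file for the 8-byte PNG signature. This is a minimal sketch with a hypothetical `is_png` helper, assuming the outputs are PNG files as the `.png` extension in the script suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file starts with the 8-byte PNG signature.
is_png() {
  local file="$1" sig
  [ -f "$file" ] || return 1
  # Read the first 8 bytes as lowercase hex with all whitespace removed.
  sig=$(head -c 8 "$file" | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "89504e470d0a1a0a" ]
}

# Example (commented out so the sketch has no side effects):
# for f in outputs_matrix_v3/*.png; do
#   is_png "$f" || echo "invalid image: $f"
# done
```

The check only confirms the file header, not that the image content is complete or meaningful.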
Result matrix
| SIMPLE BASELINE (sanity + aesthetic bias) | CONTROLLED VARIATION (multi-attribute prompts) | COMPLEX PROMPTS (relationships + reasoning) | HARD / FAILURE CASES |
|---|---|---|---|
| 1. Subject clarity (minimal prompt) | 6. Multi-object + attributes | 11. Spatial relationships | 16. Multi-subject + attributes |
| 2. Style adherence | 7. Material + lighting interaction | 12. Action + interaction | 17. Complex scene description |
| 3. Photorealism baseline | 8. Camera + realism | 13. Counting + variation | 18. Reflection + physics |
| 4. Composition / framing | 9. Style fusion | 14. Text rendering | 19. Style + realism conflict |
| 5. Color control | 10. Perspective / angle | 15. Lighting logic | 20. Compositional constraint |
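The result matrix places each tier in its own column with five prompts per column, so a prompt number maps to a table cell by integer arithmetic. A minimal sketch, using a hypothetical `matrix_cell` helper that could be used when placing generated images into the table:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a prompt number (1-20) to its "row column" position
# in the result matrix, where each column holds one tier of five prompts.
matrix_cell() {
  local n="$1" per_tier=5
  local row=$(( (n - 1) % per_tier + 1 ))
  local col=$(( (n - 1) / per_tier + 1 ))
  printf '%s %s\n' "$row" "$col"
}

# Example: prompt 14 (text rendering) sits in row 4 of the third column.
matrix_cell 14   # prints "4 3"
```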
Script scheduler version
| Code Block |
|---|
#!/usr/bin/env bash
set -e
API="http://10.9.8.207:7860"
SCHED="$API/agent-scheduler/v1/queue/txt2img"
OPTIONS="$API/sdapi/v1/options"
# ---- GLOBAL SETTINGS (KEEP CONSTANT FOR FAIR COMPARISON) ----
STEPS=50
CFG=4
AG=4
WIDTH=1024
HEIGHT=1024
SAMPLER="Default"
SEED=20260425
MODEL="Diffusers/Qwen/Qwen-Image-2512 [25468b98e3]"
# ---- PROMPTS ----
prompts=(
  "a red apple on a wooden table, soft natural lighting"
  "a small cabin in the mountains, watercolor painting, pastel tones"
  "portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens"
  "a cat sitting in the center of a window frame, symmetrical composition, morning light"
  "a street scene at night illuminated only by neon blue and pink lights"
  "three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface"
  "a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows"
  "cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field"
  "a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting"
  "extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion"
  "a wooden chair placed on top of a table inside a small room, viewed from the doorway"
  "a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition"
  "five birds sitting on a wire, each bird a different color and size"
  "a storefront sign that clearly reads \"OPEN 24 HOURS\", realistic street photography"
  "a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff"
  "two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background"
  "a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge"
  "a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions"
  "a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting"
  "a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white"
)
# ---- QUEUE JOBS ----
i=1
for prompt in "${prompts[@]}"; do
  printf "[QUEUE %02d] %s\n" "$i" "$prompt"
  json=$(jq -n \
    --arg prompt "$prompt" \
    --arg sampler "$SAMPLER" \
    --argjson steps "$STEPS" \
    --arg checkpoint "$MODEL" \
    --argjson cfg "$CFG" \
    --argjson ag "$AG" \
    --argjson w "$WIDTH" \
    --argjson h "$HEIGHT" \
    --argjson seed "$SEED" \
    '{
      sd_model_checkpoint: $checkpoint,
      prompt: $prompt,
      steps: $steps,
      cfg_scale: $cfg,
      pag_scale: $ag,
      width: $w,
      height: $h,
      sampler_name: $sampler,
      seed: $seed,
      batch_size: 1,
      n_iter: 1,
      save_images: true
    }')
  # Task name: zero-padded index plus the first 16 characters of the prompt
  TASK_NAME="$(printf '%02d' "$i") ${prompt:0:16}"
  ENCODED=$(python3 -c "import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1]))" "$TASK_NAME")
  curl -s -X POST "$SCHED?name=$ENCODED" \
    -H "Content-Type: application/json" \
    -d "$json" > /dev/null
  ((i++))
done
echo "✅ All jobs queued in scheduler"
|
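The scheduler script calls `python3` only to percent-encode the task name. Where Python is not available, the same step could be done in bash. The following is a sketch under the assumption that task names are ASCII, which holds for the prompts above; `urlencode` is a hypothetical helper. Unlike Python's `urllib.parse.quote` with default arguments, it also encodes `/`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent-encode a string for use as a URL query value.
# Assumes ASCII input; letters, digits, and - _ . ~ pass through unchanged.
urlencode() {
  local s="$1" out="" c i
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9._~-]) out+="$c" ;;
      # printf with a leading quote yields the character's numeric code
      *) out+=$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s\n' "$out"
}

# Example: encode a task name shaped like the ones the queue loop builds.
urlencode "01 a red apple on a"   # prints "01%20a%20red%20apple%20on%20a"
```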