| SIMPLE BASELINE (sanity + aesthetic bias) | CONTROLLED VARIATION (multi-attribute prompts) | COMPLEX PROMPTS (relationships + reasoning) | HARD / FAILURE CASES |
|---|---|---|---|
| 1. Subject clarity (minimal prompt) | 6. Multi-object + attributes | 11. Spatial relationships | 16. Multi-subject + attributes |
| a red apple on a wooden table, soft natural lighting | three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface | a wooden chair placed on top of a table inside a small room, viewed from the doorway | two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background |
| 2. Style adherence | 7. Material + lighting interaction | 12. Action + interaction | 17. Complex scene description |
| a small cabin in the mountains, watercolor painting, pastel tones | a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows | a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition | a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge |
| 3. Photorealism baseline | 8. Camera + realism | 13. Counting + variation | 18. Reflection + physics |
| portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens | cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field | five birds sitting on a wire, each bird a different color and size | a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions |
| 4. Composition / framing | 9. Style fusion | 14. Text rendering | 19. Style + realism conflict |
| a cat sitting in the center of a window frame, symmetrical composition, morning light | a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting | a storefront sign that clearly reads "OPEN 24 HOURS", realistic street photography | a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting |
| 5. Color control | 10. Perspective / angle | 15. Lighting logic | 20. Compositional constraint |
| a street scene at night illuminated only by neon blue and pink lights | extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion | a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff | a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white |
#!/usr/bin/env bash
API="http://127.0.0.1:7860/sdapi/v1/txt2img"
OUTDIR="outputs_matrix_v3"
mkdir -p "$OUTDIR"
# ---- GLOBAL SETTINGS (KEEP CONSTANT FOR FAIR COMPARISON) ----
STEPS=8
CFG=1
WIDTH=1024
HEIGHT=1024
SAMPLER="Default"
SEED=20260425 # set -1 for random
#MODEL='Diffusers/baidu/ERNIE-Image-Turbo [54f8a75695]' # optionally pass model name
# ---- PROMPT LIST ----
prompts=(
"a red apple on a wooden table, soft natural lighting"
"a small cabin in the mountains, watercolor painting, pastel tones"
"portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens"
"a cat sitting in the center of a window frame, symmetrical composition, morning light"
"a street scene at night illuminated only by neon blue and pink lights"
"three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface"
"a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows"
"cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field"
"a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting"
"extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion"
"a wooden chair placed on top of a table inside a small room, viewed from the doorway"
"a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition"
"five birds sitting on a wire, each bird a different color and size"
"a storefront sign that clearly reads \"OPEN 24 HOURS\", realistic street photography"
"a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff"
"two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background"
"a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge"
"a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions"
"a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting"
"a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white"
)
# ---- OPTIONAL: SWITCH MODEL ----
if [ -n "${MODEL:-}" ]; then
  echo "🔄 Switching model to: $MODEL"
  # Build the payload with jq so special characters in the model name are escaped
  curl -s -X POST "http://127.0.0.1:7860/sdapi/v1/options" \
    -H "Content-Type: application/json" \
    -d "$(jq -n --arg model "$MODEL" '{sd_model_checkpoint: $model}')" > /dev/null
  sleep 2
fi
# ---- GENERATION LOOP ----
i=1
total=${#prompts[@]}
for prompt in "${prompts[@]}"; do
  printf "\n[%02d/%02d] Generating...\n" "$i" "$total"
  # Build the request body with jq so the prompt is JSON-escaped correctly
  json=$(jq -n \
    --arg prompt "$prompt" \
    --arg sampler "$SAMPLER" \
    --argjson steps "$STEPS" \
    --argjson cfg "$CFG" \
    --argjson w "$WIDTH" \
    --argjson h "$HEIGHT" \
    --argjson seed "$SEED" \
    '{
      prompt: $prompt,
      steps: $steps,
      cfg_scale: $cfg,
      width: $w,
      height: $h,
      sampler_name: $sampler,
      seed: $seed,
      batch_size: 1,
      n_iter: 1
    }')
  response=$(curl -s "$API" \
    -H "Content-Type: application/json" \
    -d "$json")
  # Extract the base64 image and save it (date +%F is the portable form of GNU date --iso)
  outfile="$OUTDIR/$(date +%F)_$(printf '%02d' "$i")_seed${SEED}.png"
  echo "$response" | jq -r '.images[0]' | base64 -d > "$outfile"
  i=$((i + 1))
done
echo "✅ Done. Images saved to $OUTDIR/"
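
The generation loop above pipes `.images[0]` straight into `base64 -d`, so a failed request (where jq prints `null`) still leaves a broken `.png` on disk. A minimal sketch of a guard follows; `save_b64_image` is a hypothetical helper name, not part of the original script, and it only checks that the payload is present, decodes cleanly, and is non-empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode a base64 payload to a file, refusing obviously bad input.
# Usage: save_b64_image "<base64 string>" "<output path>"
save_b64_image() {
  local b64="$1" out="$2"
  # jq -r prints the literal string "null" when the field is missing
  if [ -z "$b64" ] || [ "$b64" = "null" ]; then
    echo "no image in response, skipping $out" >&2
    return 1
  fi
  # Decode; remove the partial file if the payload is not valid base64
  if ! printf '%s' "$b64" | base64 -d > "$out" 2>/dev/null; then
    echo "base64 decode failed for $out" >&2
    rm -f "$out"
    return 1
  fi
  # An empty file means nothing useful was decoded
  if [ ! -s "$out" ]; then
    echo "decoded image is empty: $out" >&2
    rm -f "$out"
    return 1
  fi
}
```

Inside the loop this could replace the final pipeline, for example `save_b64_image "$(echo "$response" | jq -r '.images[0]')" "$OUTDIR/name.png" || echo "prompt $i failed"`, so one bad response does not silently pollute the comparison set.
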
| SIMPLE BASELINE (sanity + aesthetic bias) | CONTROLLED VARIATION (multi-attribute prompts) | COMPLEX PROMPTS (relationships + reasoning) | HARD / FAILURE CASES |
|---|---|---|---|
| 1. Subject clarity (minimal prompt) | 6. Multi-object + attributes | 11. Spatial relationships | 16. Multi-subject + attributes |
| 2. Style adherence | 7. Material + lighting interaction | 12. Action + interaction | 17. Complex scene description |
| 3. Photorealism baseline | 8. Camera + realism | 13. Counting + variation | 18. Reflection + physics |
| 4. Composition / framing | 9. Style fusion | 14. Text rendering | 19. Style + realism conflict |
| 5. Color control | 10. Perspective / angle | 15. Lighting logic | 20. Compositional constraint |
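
The table is numbered column by column: prompts 1 to 5 are the simple baseline, 6 to 10 the controlled variations, 11 to 15 the complex prompts, and 16 to 20 the hard cases. A small sketch of that arithmetic, useful for tagging output files by tier; the slug spellings and the `tier_for_index` name are illustrative and do not appear in the scripts.

```shell
#!/usr/bin/env bash
# Tier slugs in column order, abbreviated from the table headings (illustrative names)
tiers=(simple controlled complex hard)

# Map a 1-based prompt number to its tier slug, assuming five prompts per tier
tier_for_index() {
  local n="$1" per_tier=5
  [[ "$n" =~ ^[0-9]+$ ]] || { echo "not a number: $n" >&2; return 1; }
  n=$((10#$n))  # force base 10 so zero-padded input such as 08 is accepted
  if (( n < 1 || n > per_tier * ${#tiers[@]} )); then
    echo "prompt number out of range: $n" >&2
    return 1
  fi
  # Integer division: 1-5 -> 0, 6-10 -> 1, 11-15 -> 2, 16-20 -> 3
  printf '%s\n' "${tiers[(n - 1) / per_tier]}"
}
```

For example, `tier_for_index 14` prints `complex`, matching "14. Text rendering" in the third column.
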
Scheduler version of the script (queues each prompt as an agent-scheduler task instead of generating synchronously)
#!/usr/bin/env bash
set -e
API="http://10.9.8.207:7860"
SCHED="$API/agent-scheduler/v1/queue/txt2img"
OPTIONS="$API/sdapi/v1/options"
# ---- GLOBAL SETTINGS (KEEP CONSTANT FOR FAIR COMPARISON) ----
STEPS=50
CFG=4
AG=4 # sent as pag_scale in the request body
WIDTH=1024
HEIGHT=1024
SAMPLER="Default"
SEED=20260425
MODEL="Diffusers/Qwen/Qwen-Image-2512 [25468b98e3]"
# ---- PROMPTS ----
prompts=(
"a red apple on a wooden table, soft natural lighting"
"a small cabin in the mountains, watercolor painting, pastel tones"
"portrait photo of a 35 year old man, neutral expression, studio lighting, 85mm lens"
"a cat sitting in the center of a window frame, symmetrical composition, morning light"
"a street scene at night illuminated only by neon blue and pink lights"
"three glass bottles, one filled with red liquid, one blue, one green, arranged in a row on a reflective surface"
"a chrome sphere and a matte black cube on a white surface, strong directional sunlight casting sharp shadows"
"cinematic photo of a woman walking in rain, wet asphalt reflections, shot on 50mm lens, shallow depth of field"
"a futuristic city skyline in the style of cyberpunk and art deco, highly detailed, dramatic lighting"
"extreme low angle view of a towering skyscraper disappearing into fog, wide angle lens distortion"
"a wooden chair placed on top of a table inside a small room, viewed from the doorway"
"a chef flipping a pancake in mid air in a busy kitchen, motion blur, dynamic composition"
"five birds sitting on a wire, each bird a different color and size"
"a storefront sign that clearly reads \"OPEN 24 HOURS\", realistic street photography"
"a candle lighting a dark room, objects gradually fading into shadow, realistic light falloff"
"two identical twins, one wearing black suit and one wearing white suit, standing side by side, neutral background"
"a cluttered desk with a laptop, a coffee mug, scattered papers, a glowing desk lamp, and a small plant near the edge"
"a glass of water on a mirror surface reflecting a sunset sky, realistic reflections and refractions"
"a hyper realistic photograph of a dragon sitting in a modern living room, natural lighting"
"a perfectly centered circle inside a square frame, minimalistic design, high contrast black and white"
)
# ---- QUEUE JOBS ----
i=1
for prompt in "${prompts[@]}"; do
  printf "[QUEUE %02d] %s\n" "$i" "$prompt"
  # Build the request body with jq so the prompt and model name are JSON-escaped correctly
  json=$(jq -n \
    --arg prompt "$prompt" \
    --arg sampler "$SAMPLER" \
    --arg checkpoint "$MODEL" \
    --argjson steps "$STEPS" \
    --argjson cfg "$CFG" \
    --argjson ag "$AG" \
    --argjson w "$WIDTH" \
    --argjson h "$HEIGHT" \
    --argjson seed "$SEED" \
    '{
      sd_model_checkpoint: $checkpoint,
      prompt: $prompt,
      steps: $steps,
      cfg_scale: $cfg,
      pag_scale: $ag,
      width: $w,
      height: $h,
      sampler_name: $sampler,
      seed: $seed,
      batch_size: 1,
      n_iter: 1,
      save_images: true
    }')
  # Task name: zero-padded index plus the first 16 characters of the prompt
  TASK_NAME="$(printf '%02d' "$i") ${prompt:0:16}"
  # Percent-encode the name so it is safe to pass as a query parameter
  ENCODED=$(python3 -c "import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1]))" "$TASK_NAME")
  curl -s -X POST "$SCHED?name=$ENCODED" \
    -H "Content-Type: application/json" \
    -d "$json" > /dev/null
  i=$((i + 1))
done
echo "✅ All jobs queued in scheduler"
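
The queue loop above shells out to `python3` only to percent-encode the task name. Where Python is not available, a pure-bash encoder is one possible substitute. This is a sketch under the assumption that task names are ASCII (true for the prompts listed here); `urlencode` is a hypothetical helper, and unlike `urllib.parse.quote` with its default arguments it also encodes `/`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent-encode an ASCII string for use in a URL query parameter.
# Unreserved characters (letters, digits, - _ . ~) pass through unchanged.
urlencode() {
  local LC_ALL=C  # byte-wise matching so the bracket ranges cover ASCII only
  local s="$1" out="" c hex i
  for ((i = 0; i < ${#s}; i++)); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *)
        # A leading single quote makes printf emit the character's numeric code
        printf -v hex '%%%02X' "'$c"
        out+="$hex"
        ;;
    esac
  done
  printf '%s\n' "$out"
}
```

With this in place, the `ENCODED=...` line could become `ENCODED=$(urlencode "$TASK_NAME")`; for the first prompt the task name `01 a red apple on a` encodes to `01%20a%20red%20apple%20on%20a`.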