...

Code Block
source llm_env/bin/activate
#pip install open-webui==0.2.5  # older pinned version, kept for reference
pip install open-webui  # installs the current release (0.6.10 at the time of writing)
open-webui serve
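
If the llm_env virtual environment referenced above does not exist yet, it can be created first with Python's built-in venv module. A minimal sketch (the llm_env path and the default port 8080 are assumptions based on the commands above and open-webui's documented defaults):

Code Block
# Assumption: llm_env has not been created yet
python3 -m venv llm_env
source llm_env/bin/activate
pip install --upgrade pip
# after "open-webui serve" starts, the UI is reachable at http://localhost:8080 by default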


Benchmark LLM

Code Block
git clone https://github.com/tabletuser-blogspot/ollama-benchmark
cd ollama-benchmark/
chmod +x obench.sh
time ./obench.sh
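
The script presumably loops over locally pulled models and records their generation speed; the same prompt eval rate and eval rate figures shown in the table below can also be collected by hand with ollama's --verbose flag, which prints timing statistics after each response. A minimal sketch (model name and prompt are only examples):

Code Block
# Assumption: qwen3:32b is already pulled; any local model works here
ollama run qwen3:32b --verbose "Summarize the TCP three-way handshake in one sentence."
# --verbose appends timing statistics, including lines like:
#   prompt eval rate: ... tokens/s
#   eval rate:        ... tokens/s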



GPU backend model performance

...

Code Block
(base) root@server1:~# ollama --version
ollama version is 0.7.0
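
The SIZE column in the table below appears to be the model footprint reported by ollama; which models are pulled locally, and how large they are, can be checked with ollama list:

Code Block
# lists local models with NAME, ID, SIZE and MODIFIED columns
ollama list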



Model                  | Started in (seconds) | Params | Size   | Prompt eval rate | Eval rate
deepseek-r1:70b        | 21.34                | 70B    | 42 GB  | 2.20 tokens/s    | 1.24 tokens/s
llama3.3:70b           | 21.34                | 70B    | 42 GB  | 2.39 tokens/s    | 1.23 tokens/s
qwen3:32b              | 10.04                | 32B    | 20 GB  | 5.63 tokens/s    | 2.54 tokens/s
gemma3:27b             | 1.76                 | 27B    | 17 GB  | 6.66 tokens/s    | 3.03 tokens/s
mistral-small3.1:24b   | 3.26                 | 24B    | 15 GB  | 7.72 tokens/s    | 3.60 tokens/s
llama4:scout           | 13.55                | 17B    | 67 GB  | 11.47 tokens/s   | 4.76 tokens/s
deepseek-v2:16b        | 4.02                 | 16B    | 8.9 GB | 58.75 tokens/s   | 24.50 tokens/s
phi3:14b               | 3.52                 | 14B    | 7.9 GB | 15.12 tokens/s   | 6.05 tokens/s
openchat:7b            | 2.51                 | 7B     | 4.1 GB | 30.37 tokens/s   | 11.19 tokens/s


llama.cpp

https://github.com/ggml-org/llama.cpp

...