...
```
(base) root@server1:~# ollama --version
ollama version is 0.7.0
```
| Model | Started in (s) | Params | Size | CPU model buffer size | Tokens/s |
|---|---|---|---|---|---|
| deepseek-r1:70b | | | 42 GB | | |
| llama3.3:70b | | | 42 GB | | |
| Qwen3 32B | 10.04 | 32B | 20 GB | 19259.71 MiB | |
| phi3:14b | 3.52 | 14B | 7.9 GB | 7530.58 MiB | 15.12 |
| openchat7b | | | 4.1 GB | | |
| llama4:scout | | | | | |
| gemma3:27b | | | 17 GB | | |
| mistral-small3.1:24b | | | 15 GB | | |
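
One way figures like these can be collected is Ollama's built-in timing output: `ollama run --verbose` prints load duration and eval rate (tokens/s) after each response, and the buffer-size values presumably come from the model-load lines in the Ollama server log. A minimal sketch, assuming the phi3:14b tag from the table and a hypothetical prompt:

```
# Pull the model and run a single prompt with timing statistics.
ollama pull phi3:14b
ollama run phi3:14b --verbose "Summarize the CAP theorem in two sentences."
# --verbose appends load duration, prompt eval rate and eval rate (tokens/s)
# after the response; the CPU model buffer size is reported in the Ollama
# server log while the model is being loaded.
```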
llama.cpp
https://github.com/ggml-org/llama.cpp
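
For comparison with the Ollama numbers above, llama.cpp can be built from source and benchmarked directly with its bundled llama-bench tool. A minimal CPU-only sketch, assuming a CMake toolchain and a locally available GGUF file (the model path is a placeholder):

```
# Clone and build llama.cpp (default CPU build).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j
# Benchmark prompt processing and token generation for a GGUF model;
# replace the path with a real model file.
./build/bin/llama-bench -m /path/to/model.gguf
```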
...