...
```shell
sudo apt update
sudo apt upgrade
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.11 -y
sudo apt install python3.11-venv -y
python3.11 -V
python3.11 -m venv llm_env
source llm_env/bin/activate
pip install --pre --upgrade "ipex-llm[cpp]"
mkdir llama-cpp
cd llama-cpp

# Run Ollama Serve with Intel GPU
export OLLAMA_NUM_GPU=999
export no_proxy=localhost,127.0.0.1
export ZES_ENABLE_SYSMAN=1
source /opt/intel/oneapi/setvars.sh
export SYCL_CACHE_PERSISTENT=1

# localhost access
# ./ollama serve

# for non-localhost access
OLLAMA_HOST=0.0.0.0 ./ollama serve
```
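Once the server is up, it can be sanity-checked from a second shell via Ollama's REST API; a minimal sketch (the model name is just an example, use one you have pulled):

```shell
# List the models the running server knows about (Ollama listens on port 11434 by default)
curl http://127.0.0.1:11434/api/tags

# One-shot, non-streaming generation request
curl http://127.0.0.1:11434/api/generate \
  -d '{"model": "qwen3:32b", "prompt": "Why is the sky blue?", "stream": false}'
```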
List models
```
(base) root@server1:~/llama-cpp# ./ollama list
NAME               ID              SIZE     MODIFIED
qwen3:32b          e1c9f234c6eb    20 GB    28 minutes ago
gemma3:27b         a418f5838eaf    17 GB    37 minutes ago
deepseek-r1:70b    0c1615a8ca32    42 GB    About an hour ago
```
Pull model
```
(base) root@server1:~/llama-cpp# ./ollama pull openchat:7b
pulling manifest
pulling 1cecc26325a1... 100% ▕████████████████▏ 4.1 GB
pulling 43070e2d4e53... 100% ▕████████████████▏  11 KB
pulling d68706c17530... 100% ▕████████████████▏    98 B
pulling 415f0f6b43dd... 100% ▕████████████████▏    65 B
pulling 278996753456... 100% ▕████████████████▏   483 B
verifying sha256 digest
writing manifest
success
```
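After the pull completes, the model can be exercised directly with the same binary; a minimal sketch (the prompt text is arbitrary):

```shell
# Interactive chat session with the freshly pulled model
./ollama run openchat:7b

# Or a single non-interactive prompt
./ollama run openchat:7b "Summarize what a quantized 7B model is in one sentence."
```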
Web-UI
```shell
source llm_env/bin/activate
# pip install open-webui==0.2.5
pip install open-webui   # 0.6.10 at the time of writing
open-webui serve
```
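Open WebUI serves on port 8080 by default and looks for a local Ollama instance; if the Ollama server runs on another host or port, it can be pointed there explicitly (the address below is an example):

```shell
# Tell Open WebUI where the Ollama server lives (default assumption is localhost:11434)
export OLLAMA_BASE_URL=http://127.0.0.1:11434
open-webui serve --port 8080
```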
| Model | Sec to load model | Layers offloaded to GPU |
|---|---|---|
| DeepSeek R1 Distill Llama 70B | 54.25 | 81/81 |
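The load-time figure above can be turned into a rough throughput estimate; a back-of-envelope sketch, assuming the 42 GB on-disk size that `ollama list` reported for deepseek-r1:70b corresponds to this model:

```shell
# 42 GB loaded in 54.25 s gives the effective load throughput in GB/s
awk 'BEGIN { printf "%.2f GB/s\n", 42 / 54.25 }'
```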
llama.cpp
https://github.com/ggml-org/llama.cpp
...


