...
sensors output (deepseek-r1:70b running on CPU) at ~80 W power consumption
| Code Block |
|---|
| title | sensors (CPU) |
|---|
| collapse | true |
|---|
|
sensors
iwlwifi_1-virtual-0
Adapter: Virtual device
temp1: N/A
spd5118-i2c-6-50
Adapter: SMBus I801 adapter at efa0
temp1: +78.2°C (low = +0.0°C, high = +55.0°C)
(crit low = +0.0°C, crit = +85.0°C)
nvme-pci-0200
Adapter: PCI adapter
Composite: +39.9°C (low = -273.1°C, high = +82.8°C)
(crit = +84.8°C)
acpi_fan-acpi-0
Adapter: ACPI interface
fan1: N/A
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +101.0°C (high = +110.0°C, crit = +110.0°C)
Core 0: +83.0°C (high = +110.0°C, crit = +110.0°C)
Core 1: +83.0°C (high = +110.0°C, crit = +110.0°C)
Core 2: +84.0°C (high = +110.0°C, crit = +110.0°C)
Core 3: +84.0°C (high = +110.0°C, crit = +110.0°C)
Core 4: +84.0°C (high = +110.0°C, crit = +110.0°C)
Core 5: +84.0°C (high = +110.0°C, crit = +110.0°C)
Core 6: +84.0°C (high = +110.0°C, crit = +110.0°C)
Core 7: +84.0°C (high = +110.0°C, crit = +110.0°C)
Core 8: +101.0°C (high = +110.0°C, crit = +110.0°C)
Core 12: +100.0°C (high = +110.0°C, crit = +110.0°C)
Core 16: +100.0°C (high = +110.0°C, crit = +110.0°C)
Core 20: +99.0°C (high = +110.0°C, crit = +110.0°C)
Core 24: +97.0°C (high = +110.0°C, crit = +110.0°C)
Core 28: +100.0°C (high = +110.0°C, crit = +110.0°C)
Core 32: +73.0°C (high = +110.0°C, crit = +110.0°C)
Core 33: +73.0°C (high = +110.0°C, crit = +110.0°C)
nvme-pci-0100
Adapter: PCI adapter
Composite: +56.9°C (low = -5.2°C, high = +89.8°C)
(crit = +93.8°C)
Sensor 1: +70.8°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +47.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 3: +46.9°C (low = -273.1°C, high = +65261.8°C)
acpitz-acpi-0
Adapter: ACPI interface
temp1: +27.8°C
|
sensors output (deepseek-r1:70b running on GPU) at ~60 W power consumption
| Code Block |
|---|
| title | sensors (GPU) |
|---|
| collapse | true |
|---|
|
(base) root@server1:~# sensors
iwlwifi_1-virtual-0
Adapter: Virtual device
temp1: N/A
spd5118-i2c-6-50
Adapter: SMBus I801 adapter at efa0
temp1: +82.2°C (low = +0.0°C, high = +55.0°C)
(crit low = +0.0°C, crit = +85.0°C)
nvme-pci-0200
Adapter: PCI adapter
Composite: +39.9°C (low = -273.1°C, high = +82.8°C)
(crit = +84.8°C)
acpi_fan-acpi-0
Adapter: ACPI interface
fan1: N/A
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +97.0°C (high = +110.0°C, crit = +110.0°C)
Core 0: +58.0°C (high = +110.0°C, crit = +110.0°C)
Core 1: +59.0°C (high = +110.0°C, crit = +110.0°C)
Core 2: +58.0°C (high = +110.0°C, crit = +110.0°C)
Core 3: +59.0°C (high = +110.0°C, crit = +110.0°C)
Core 4: +67.0°C (high = +110.0°C, crit = +110.0°C)
Core 5: +68.0°C (high = +110.0°C, crit = +110.0°C)
Core 6: +67.0°C (high = +110.0°C, crit = +110.0°C)
Core 7: +67.0°C (high = +110.0°C, crit = +110.0°C)
Core 8: +54.0°C (high = +110.0°C, crit = +110.0°C)
Core 12: +97.0°C (high = +110.0°C, crit = +110.0°C)
Core 16: +59.0°C (high = +110.0°C, crit = +110.0°C)
Core 20: +77.0°C (high = +110.0°C, crit = +110.0°C)
Core 24: +56.0°C (high = +110.0°C, crit = +110.0°C)
Core 28: +61.0°C (high = +110.0°C, crit = +110.0°C)
Core 32: +63.0°C (high = +110.0°C, crit = +110.0°C)
Core 33: +63.0°C (high = +110.0°C, crit = +110.0°C)
nvme-pci-0100
Adapter: PCI adapter
Composite: +59.9°C (low = -5.2°C, high = +89.8°C)
(crit = +93.8°C)
Sensor 1: +73.8°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +50.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 3: +49.9°C (low = -273.1°C, high = +65261.8°C)
acpitz-acpi-0
Adapter: ACPI interface
temp1: +27.8°C
|
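To compare dumps like the two above without reading every line, the hottest core can be pulled out with a short filter. This is only a sketch: `max_core_temp` is a hypothetical helper name, and it assumes the `Core N: +T°C (...)` line layout shown in the coretemp section.

```shell
#!/bin/bash
# Sketch: report the hottest core from `sensors` output read on stdin.
# max_core_temp is a hypothetical helper, not part of lm-sensors.
max_core_temp() {
    # Matches lines such as:
    #   Core 12: +97.0°C (high = +110.0°C, crit = +110.0°C)
    awk '/^ *Core [0-9]+:/ {
             t = $3                      # e.g. "+97.0°C"
             gsub(/[^0-9.-]/, "", t)     # keep digits, decimal point and minus sign
             if (t + 0 > max) {
                 max = t + 0
                 core = $2
                 sub(/:/, "", core)      # "12:" -> "12"
             }
         }
         END { if (core != "") printf "%s %.1f\n", core, max }'
}

# Usage (requires lm-sensors):
#   sensors | max_core_temp
```

Fed the two dumps above, this prints `8 101.0` for the CPU run and `12 97.0` for the GPU run. When several cores tie, the first one listed is reported.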
...
| Code Block |
|---|
| title | batch-obench.sh |
|---|
| collapse | true |
|---|
|
#!/bin/bash
# Benchmark using ollama; reports the generation rate in tokens per second
# idea taken from https://taoofmac.com/space/blog/2024/01/20/1800
# batch-obench.sh is a modification of obench.sh from https://github.com/tabletuser-blogspot/ollama-benchmark
# made by liutyi for tests on https://wiki.liutyi.info
set -e
borange='\e[0;33m'
yellow='\e[1;33m'
purple='\e[0;35m'
green='\e[0;32m'
red='\e[0;31m'
blue='\e[0;34m'
NC='\e[0m' # No Color
cpu_def=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor)
echo "Setting cpu governor to"
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
gpu_avail=$(sudo lshw -C display | grep product: | head -1 | cut -c17-)
cpugover=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor)
cpu_used=$(lscpu | grep 'Model name' | cut -f 2 -d ":" | awk '{$1=$1}1')
echo ""
echo "Simple benchmark using ollama and"
echo "whatever local Model is installed."
echo "Does not identify whether $gpu_avail is actually used for the benchmark"
echo ""
benchmark=3
echo "How many times to run the benchmark?"
echo $benchmark
echo ""
for model in $(ollama ls | awk 'NR>1 {print $1}'); do
echo -e "Total runs "${purple}$benchmark${NC}
echo ""
echo ""
echo $model
ollama show $model --system
echo "" | tee -a results.txt
echo -e "Will use model: "${green}$model${NC} | tee -a results.txt
echo "" | tee -a results.txt
echo -e Will benchmark the tokens per second for ${cpu_used} and/or ${gpu_avail} | tee -a results.txt
echo "" | tee -a results.txt
echo "" | tee -a results.txt
echo -e Running benchmark ${purple}$benchmark${NC} times for ${cpu_used} and/or ${gpu_avail} | tee -a results.txt
echo -e with ${borange}$cpugover${NC} setting for cpu governor | tee -a results.txt
echo "" | tee -a results.txt
for run in $(seq 1 $benchmark); do
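# ollama prints its timing statistics on stderr, so keep stderr and discard the model's answer
echo "Why is the blue sky blue?" | ollama run "$model" --verbose 2>&1 >/dev/null | grep "eval rate:" | tee -a results.txt
# Recomputed after every run; only the value after the last run is reported.
# The first of the last $benchmark eval rates is left out of the average.
avg=$(grep -v "prompt eval rate:" results.txt | tail -n "$benchmark" | awk '{print $3}' | awk 'NR>1{ tot+=$1 } END{ if (NR>1) print tot/(NR-1); else print "n/a" }')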
done
echo "" | tee -a results.txt
echo -e ${red}$avg${NC} is the average ${blue}tokens per second${NC} using ${green}$model${NC} model | tee -a results.txt
echo for $cpu_used and/or $gpu_avail | tee -a results.txt
done
echo
echo -e using ${borange}$cpugover${NC} for cpu governor.
echo ""
echo "Setting cpu governor to"
echo "$cpu_def" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
echo . |
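The averaging step of batch-obench.sh is packed into one awk pipeline, so here it is as a standalone sketch. `avg_eval_rate` is a hypothetical helper name. It assumes the `eval rate: <number> tokens/s` lines that the script greps from `ollama run --verbose`, and like the script it leaves the first run out of the average (as read here, that run serves as a warm-up).

```shell
#!/bin/bash
# Sketch: average the "eval rate" values of several runs, skipping the first run.
# avg_eval_rate is a hypothetical helper; it reads `ollama run --verbose` statistics on stdin.
avg_eval_rate() {
    grep "eval rate:" | grep -v "prompt eval rate:" |
        awk '{ rate[NR] = $3 }               # third field is the numeric rate
             END {
                 if (NR == 0) { print "n/a"; exit }
                 first = (NR > 1) ? 2 : 1    # with a single run there is nothing to skip
                 for (i = first; i <= NR; i++) tot += rate[i]
                 printf "%.2f\n", tot / (NR - first + 1)
             }'
}

# Usage, assuming results.txt holds the runs of one model only:
#   avg_eval_rate < results.txt
```

Unlike the `tot/(NR-1)` expression in the script, this version does not divide by zero when only one run is available; it falls back to that single value.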
quant types (copied from Reddit)
| Code Block |
|---|
| title | quant types |
|---|
| collapse | true |
|---|
|
Old quant types (some base model types require these):
- Q4_0: small, very high quality loss - legacy, prefer using Q3_K_M
- Q4_1: small, substantial quality loss - legacy, prefer using Q3_K_L
- Q5_0: medium, balanced quality - legacy, prefer using Q4_K_M
- Q5_1: medium, low quality loss - legacy, prefer using Q5_K_M
New quant types (recommended):
- Q2_K: smallest, extreme quality loss - not recommended
- Q3_K: alias for Q3_K_M
- Q3_K_S: very small, very high quality loss
- Q3_K_M: very small, very high quality loss
- Q3_K_L: small, substantial quality loss
- Q4_K: alias for Q4_K_M
- Q4_K_S: small, significant quality loss
- Q4_K_M: medium, balanced quality - recommended
- Q5_K: alias for Q5_K_M
- Q5_K_S: large, low quality loss - recommended
- Q5_K_M: large, very low quality loss - recommended
- Q6_K: very large, extremely low quality loss
- Q8_0: very large, extremely low quality loss - not recommended
- F16: extremely large, virtually no quality loss - not recommended
- F32: absolutely huge, lossless - not recommended |
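The "prefer using" and "alias for" notes in the list above can be written as a small lookup, which is handy when picking a model tag in a script. `preferred_quant` is a hypothetical helper name, and the mapping is only what the list states, not an authoritative recommendation.

```shell
#!/bin/bash
# Sketch: map a legacy or alias quant type to the type the list above points to.
# preferred_quant is a hypothetical helper; other names are returned unchanged.
preferred_quant() {
    case "$1" in
        Q4_0) echo "Q3_K_M" ;;   # legacy, prefer Q3_K_M
        Q4_1) echo "Q3_K_L" ;;   # legacy, prefer Q3_K_L
        Q5_0) echo "Q4_K_M" ;;   # legacy, prefer Q4_K_M
        Q5_1) echo "Q5_K_M" ;;   # legacy, prefer Q5_K_M
        Q3_K) echo "Q3_K_M" ;;   # alias
        Q4_K) echo "Q4_K_M" ;;   # alias
        Q5_K) echo "Q5_K_M" ;;   # alias
        *)    echo "$1" ;;       # already a concrete, non-legacy type
    esac
}

# Usage:
#   preferred_quant Q5_0
```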
...