Run Ollama Using Podman With an AMD GPU
Recently I set up ollama with ollama-web as systemd quadlets using podman. Here are the configs:
`ollama.container`:

```ini
[Unit]
Description=The Ollama container
After=local-fs.target
[Service]
Restart=always
TimeoutStartSec=60
# Ensure there's a userland podman.sock
ExecStartPre=/bin/systemctl --user enable podman.socket
# Ensure that the dir exists
ExecStartPre=-mkdir -p %h/.ollama
[Container]
AutoUpdate=registry
ContainerName=ollama
Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"
PublishPort=11434:11434
RemapUsers=keep-id
RunInit=yes
NoNewPrivileges=no
Network=ollama.network
Volume=%h/.ollama:/.ollama
PodmanArgs=--userns=keep-id
PodmanArgs=--group-add=keep-groups
PodmanArgs=--ulimit=host
PodmanArgs=--security-opt=label=disable
PodmanArgs=--cgroupns=host
Image=docker.io/ollama/ollama:rocm
AddDevice=/dev/dri
AddDevice=/dev/kfd
[Install]
RequiredBy=multi-user.target
```
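The `AddDevice` lines pass the GPU through, and `--group-add=keep-groups` carries your supplementary groups into the container. Before starting the unit, it's worth a host-side sanity check that the device nodes exist and that your user can reach them; the group names below are an assumption that varies by distro:

```sh
# Confirm the GPU device nodes exist on the host
ls -la /dev/kfd /dev/dri

# Confirm the current user is in the groups that own them
# (commonly render and/or video, depending on the distro)
id -nG | tr ' ' '\n' | grep -E '^(render|video)$'
```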
`ollama-web.container`:

```ini
[Unit]
Description=An Ollama WebUI container
After=network-online.target ollama.service
Requires=ollama.service
[Container]
Image=ghcr.io/open-webui/open-webui:latest
AutoUpdate=registry
ContainerName=ollama-web
Environment=OLLAMA_BASE_URL=http://ollama:11434
Environment=WEBUI_SECRET_KEY=abc123
Environment=DEFAULT_USER_ROLE=admin
# Open WebUI does not allow access without a user account, nor does it allow
# account creation via environment variables.
Environment=ENABLE_SIGNUP=true
PublishPort=8080:8080
Network=ollama.network
Volume=%h/.ollama-web/open-webui:/app/backend/data
[Service]
TimeoutStartSec=900
ExecStartPre=-mkdir -p %h/.ollama-web
[Install]
WantedBy=multi-user.target
```
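One note on the environment above: `WEBUI_SECRET_KEY=abc123` is just a placeholder. If the UI is reachable by anyone but you, generate a real random value, for example:

```sh
# Generate a random value for WEBUI_SECRET_KEY
openssl rand -hex 32
```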
`ollama.network`:

```ini
[Network]
NetworkName=ollama
```
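With all three files written, quadlet generates the corresponding systemd units on a daemon reload. Roughly, assuming rootless podman with systemd user units and the file names above:

```sh
# Install the quadlet files where podman's systemd generator looks
mkdir -p ~/.config/containers/systemd
cp ollama.container ollama-web.container ollama.network \
   ~/.config/containers/systemd/

# Regenerate units; starting ollama-web pulls in ollama via Requires=
systemctl --user daemon-reload
systemctl --user start ollama-web.service
systemctl --user status ollama.service ollama-web.service
```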
To make sure the container has the correct access, I ran the following check to verify that the container is able to list `/dev/dri` and `/dev/kfd`:
```
$ podman exec ollama ls -la /dev
total 0
drwxr-xr-x. 6 root root 380 Jan 25 18:26 .
dr-xr-xr-x. 18 root root 26 Jan 25 18:26 ..
lrwxrwxrwx. 1 root root 11 Jan 25 18:26 core -> /proc/kcore
drwxr-xr-x. 2 root root 80 Jan 25 18:26 dri
lrwxrwxrwx. 1 root root 13 Jan 25 18:26 fd -> /proc/self/fd
crw-rw-rw-. 1 nobody nogroup 1, 7 Jan 25 15:50 full
crw-rw-rw-. 1 nobody nogroup 234, 0 Jan 25 15:50 kfd
```
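Ollama also logs its GPU discovery at startup, so the container log is another quick way to see whether ROCm found the card. The exact log text varies by version, so the grep pattern below is just a guess:

```sh
# Look for ROCm/GPU discovery messages in the server log
podman logs ollama 2>&1 | grep -iE 'rocm|gfx|gpu' | head
```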
Initially I noticed that my GPU was not being utilized when queries were sent to the LLM. After a bit of digging I realized that my GPU is a `gfx1031`, which is not currently supported. The currently supported GPUs are `[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]`. However, it seems that we can force which library is used with the environment variable `HSA_OVERRIDE_GFX_VERSION=10.3.0`. I am using `10.3.0` because it is the closest supported version to my `gfx1031`.
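To confirm the override actually made it into the container, and to exercise the GPU, something like the following works; the model name is only an example, and utilization can be watched from the host with a tool such as `rocm-smi` or `radeontop` while the prompt runs:

```sh
# Verify the override is set inside the container
podman exec ollama env | grep HSA_OVERRIDE_GFX_VERSION

# Pull an example model and send a prompt through the API;
# the GPU should show activity while it generates
podman exec ollama ollama pull llama3.2
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
```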