Run Ollama Using Podman With an AMD GPU
Recently I set up ollama with ollama-web as systemd quadlets using podman. Here are the configs:
`ollama.container`:

```ini
[Unit]
Description=The Ollama container
After=local-fs.target
[Service]
Restart=always
TimeoutStartSec=60
# Ensure there's a userland podman.sock
ExecStartPre=/bin/systemctl --user enable podman.socket
# Ensure that the dir exists
ExecStartPre=-mkdir -p %h/.ollama
[Container]
AutoUpdate=registry
ContainerName=ollama
Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"
PublishPort=11434:11434
RemapUsers=keep-id
RunInit=yes
NoNewPrivileges=no
Network=ollama.network
Volume=%h/.ollama:/.ollama
PodmanArgs=--userns=keep-id
PodmanArgs=--group-add=keep-groups
PodmanArgs=--ulimit=host
PodmanArgs=--security-opt=label=disable
PodmanArgs=--cgroupns=host
Image=docker.io/ollama/ollama:rocm
AddDevice=/dev/dri
AddDevice=/dev/kfd
[Install]
RequiredBy=multi-user.target
```
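The `AddDevice` lines pass the GPU through, and `--group-add=keep-groups` carries your supplementary groups into the container. Before starting the unit, it's worth a host-side sanity check that the device nodes exist and that your user can reach them; the group names below are an assumption that varies by distro:

```sh
# Confirm the GPU device nodes exist on the host
ls -la /dev/kfd /dev/dri

# Confirm the current user is in the groups that own them
# (commonly render and/or video, depending on the distro)
id -nG | tr ' ' '\n' | grep -E '^(render|video)$'
```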
`ollama-web.container`:

```ini
[Unit]
Description=An Ollama WebUI container
After=network-online.target ollama.service
Requires=ollama.service
[Container]
Image=ghcr.io/open-webui/open-webui:latest
AutoUpdate=registry
ContainerName=ollama-web
Environment=OLLAMA_BASE_URL=http://ollama:11434
Environment=WEBUI_SECRET_KEY=abc123
Environment=DEFAULT_USER_ROLE=admin
# Open WebUI does not allow access without a user account, nor does it allow
# account creation via environment variables.
Environment=ENABLE_SIGNUP=true
PublishPort=8080:8080
Network=ollama.network
Volume=%h/.ollama-web/open-webui:/app/backend/data
[Service]
TimeoutStartSec=900
ExecStartPre=-mkdir -p %h/.ollama-web
[Install]
WantedBy=multi-user.target
```
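One note on the environment above: `WEBUI_SECRET_KEY=abc123` is just a placeholder. If the UI is reachable by anyone but you, generate a real random value, for example:

```sh
# Generate a random value for WEBUI_SECRET_KEY
openssl rand -hex 32
```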
`ollama.network`:

```ini
[Network]
NetworkName=ollama
```
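With all three files written, quadlet generates the corresponding systemd units on a daemon reload. Roughly, assuming rootless podman with systemd user units and the file names above:

```sh
# Install the quadlet files where podman's systemd generator looks
mkdir -p ~/.config/containers/systemd
cp ollama.container ollama-web.container ollama.network \
   ~/.config/containers/systemd/

# Regenerate units; starting ollama-web pulls in ollama via Requires=
systemctl --user daemon-reload
systemctl --user start ollama-web.service
systemctl --user status ollama.service ollama-web.service
```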
To make sure the container has the correct access, I ran the following check to verify that the container is able to list `/dev/dri` and `/dev/kfd`:
```
$ podman exec ollama ls -la /dev
total 0
drwxr-xr-x. 6 root root 380 Jan 25 18:26 .
dr-xr-xr-x. 18 root root 26 Jan 25 18:26 ..
lrwxrwxrwx. 1 root root 11 Jan 25 18:26 core -> /proc/kcore
drwxr-xr-x. 2 root root 80 Jan 25 18:26 dri
lrwxrwxrwx. 1 root root 13 Jan 25 18:26 fd -> /proc/self/fd
crw-rw-rw-. 1 nobody nogroup 1, 7 Jan 25 15:50 full
crw-rw-rw-. 1 nobody nogroup 234, 0 Jan 25 15:50 kfd
```
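Ollama also logs its GPU discovery at startup, so the container log is another quick way to see whether ROCm found the card. The exact log text varies by version, so the grep pattern below is just a guess:

```sh
# Look for ROCm/GPU discovery messages in the server log
podman logs ollama 2>&1 | grep -iE 'rocm|gfx|gpu' | head
```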
Initially I noticed that my GPU was not being utilized when queries were sent to the LLM. After a bit of digging I realized that my GPU is a `gfx1031`, which is not currently supported. The currently supported GPUs are `[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]`. However, it seems that we can force which library is used with the environment variable `HSA_OVERRIDE_GFX_VERSION=10.3.0`. I am using `10.3.0` because it is the closest supported version to my `gfx1031`.
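To confirm the override actually made it into the container, and to exercise the GPU, something like the following works; the model name is only an example, and utilization can be watched from the host with a tool such as `rocm-smi` or `radeontop` while the prompt runs:

```sh
# Verify the override is set inside the container
podman exec ollama env | grep HSA_OVERRIDE_GFX_VERSION

# Pull an example model and send a prompt through the API;
# the GPU should show activity while it generates
podman exec ollama ollama pull llama3.2
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
```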