I build WordPress sites. I’ve spent years working with medical clients — clinics, healthcare platforms, systems where data privacy isn’t a nice-to-have, it’s the whole point. So when everyone started talking about AI assistants, my first question wasn’t “how do I use it?” It was “where does my data go?” The answer, most of the time, was “someone else’s server.” That wasn’t good enough. So I decided to run my own.
This is the story of a weekend I spent fighting my AMD GPU to make that happen — and what actually worked in the end. Fair warning: I’m comfortable in a terminal, but I wouldn’t call myself a Linux power user. I know enough to follow instructions, debug error messages, and make bad decisions at 11pm. If that sounds like you, this article is for you.
Why Local AI? Why Now?
As a WordPress developer working in and around the medical space, I’ve watched AI tools land on the market with a kind of cautious fascination. The productivity gains are real. GitHub Copilot, ChatGPT, Claude — I’ve tried them all, and they genuinely make coding faster. But every prompt I type leaves my machine. Every snippet of a client’s codebase I paste for debugging context goes to a third-party API. For consumer projects that’s fine. For anything touching healthcare — patient portals, clinic booking systems, anything near PHI — it’s a hard no.
Local AI solves this cleanly. The model runs on your hardware. Nothing leaves your network. You get the productivity benefits without the data residency risk. The catch is that “runs locally” usually means “runs on Nvidia,” and my desktop has an AMD RX 6600.
My setup: a desktop running Ubuntu with an ASRock AB350-Pro4 motherboard and an AMD RX 6600 — a solid RDNA2 card with 8GB VRAM that, AMD’s own documentation barely acknowledges in the ROCm compatibility tables. I wanted to run Lemonade, AMD’s own open-source LLM server, inside a Docker container, and then access it from my laptop over Tailscale to use with OpenCode in VS Code. Simple plan. Four hours of terminal output later, I had something to write about.
Why Lemonade?
There are plenty of ways to run local LLMs — Ollama, LM Studio, llama.cpp directly. But Lemonade is AMD’s own project, which means it should theoretically have the best ROCm support. It exposes an OpenAI-compatible API at localhost:13305/v1, ships as a Docker image, and has a surprisingly decent built-in web chat UI. It’s also actively maintained, which matters when you’re on bleeding-edge kernel versions.
My goal was to run it in Docker — isolated, easy to update, easy to nuke and restart if something breaks. Given what followed, that last part turned out to be prophetic.
Problem #1: The Installer That Breaks Itself
Step one of any ROCm setup is downloading AMD’s installer package:
wget https://repo.radeon.com/amdgpu-install/7.2.2/ubuntu/noble/amdgpu-install_7.2.2.70202-1_all.deb
sudo apt install ./amdgpu-install_7.2.2.70202-1_all.deb
The installer runs, seems happy, and then adds a repository to your apt sources. The repository it adds? Wrong. Instead of pointing at the actual amdgpu packages, it creates an amdgpu-install.list pointing at a graphics subdirectory that returns 404 errors. Every sudo apt update after this gives you a wall of red.
The AMD installer adds a broken repository on Ubuntu Noble. Delete it immediately and add the correct repos manually.
The fix is to throw it away and do it yourself:
# Kill the broken repo
sudo rm -f /etc/apt/sources.list.d/amdgpu-install.list
# Add the correct ones manually
echo "deb https://repo.radeon.com/amdgpu/7.2.2/ubuntu noble main" | sudo tee /etc/apt/sources.list.d/amdgpu.list
echo "deb [arch=amd64] https://repo.radeon.com/rocm/apt/7.2.2 noble main" | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt update
Problem #2: DKMS Is Dead on Kernel 6.17
The next wall was trying to build the amdgpu-dkms kernel module. AMD’s official guide says to install it. AMD’s kernel module does not agree with kernel 6.17:
Building initial module for 6.17.0-20-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/amdgpu-dkms.0.crash'
Error! Bad return status for module build on kernel: 6.17.0-20-generic (x86_64)
dpkg: error processing package amdgpu-dkms (--configure):
installed amdgpu-dkms package post-installation script subprocess returned error exit status 10
This is a known upstream incompatibility — AMD’s proprietary DKMS module hasn’t been updated to compile against the API changes in kernels 6.16 and 6.17. The good news is that for Docker-based AI inference, we don’t need it at all. The RX 6600 (RDNA2/gfx1032) is fully supported by the in-kernel amdgpu driver that ships with Linux 6.17. DKMS adds extras for workstations and professional display setups — none of which matter here.
The --no-dkms flag skips the kernel module build entirely and installs only the ROCm userspace libraries:
# Install kernel headers first
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
# ROCm userspace only — skip DKMS entirely
sudo amdgpu-install --usecase=rocm --no-dkms
sudo update-initramfs -u
sudo reboot
After rebooting, verify the in-kernel driver loaded:
lsmod | grep amdgpu # should show the module
ls /dev/dri/ # card0, renderD128
ls /dev/kfd # ROCm compute node
One side note: at one point while debugging, I ran sudo modprobe amdgpu from my desktop terminal and it immediately logged me out. Alarming — until I realised that loading the module while the display server was already using it triggered a GPU reset. It was actually working. Always use a TTY (Ctrl+Alt+F3) for kernel module debugging.
Problem #3: Docker CE, Not docker.io
Ubuntu’s package manager ships an older Docker build called docker.io. It lacks the compose plugin and buildx plugin. Remove it and install Docker CE from Docker’s official repo:
# Remove old packages
sudo apt remove docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc
# Add Docker's official repo
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER
newgrp docker
The Docker Compose File (and One Non-Obvious Gotcha)
Getting the container to see the GPU requires passing the device nodes and the right group permissions. The group_add field in compose looks straightforward — until Docker throws:
Error response from daemon: Unable to find group render: no matching entries in group file
Docker’s group_add doesn’t resolve group names from the host — it needs numeric GIDs. Get them first:
getent group render # e.g. render:x:992:
getent group video # e.g. video:x:44:
Then the compose file, with the numbers substituted in. The port binding 0.0.0.0:13305:13305 is also important — without the explicit address, Docker only binds to localhost, and Tailscale can’t reach it.
services:
lemonade:
image: ghcr.io/lemonade-sdk/lemonade-server:latest
container_name: lemonade-server
ports:
- "0.0.0.0:13305:13305"
volumes:
- lemonade-cache:/root/.cache/huggingface
- lemonade-llama:/opt/lemonade/llama
- lemonade-recipe:/root/.cache/lemonade
environment:
- LEMONADE_LLAMACPP_BACKEND=rocm
devices:
- /dev/kfd
- /dev/dri
group_add:
- "992" # ⚠️ replace with your render GID
- "44" # ⚠️ replace with your video GID
restart: unless-stopped
volumes:
lemonade-cache:
lemonade-llama:
lemonade-recipe:
docker compose up -d
curl http://localhost:13305/api/v0/health
# {"status":"ok","version":"10.2.0",...}
Open http://localhost:13305 for the built-in web UI. One more gotcha: if you manage models via docker exec, the binary path matters. Just lemonade fails because /usr/local/bin isn’t in the exec PATH:
docker exec lemonade-server /usr/local/bin/lemonade pull Gemma-3-4b-it-GGUF
Problem #4: Tailscale and the GPG Key That Wasn’t There
The final piece was Tailscale — so I could use Lemonade from my laptop anywhere, routing prompts through my home GPU over an encrypted mesh network without touching router settings. The install script is one line, except it calls apt update internally, which failed because my ROCm repository was still missing its GPG signature.
The temporary repo I’d added earlier worked for installing ROCm, but wasn’t properly signed. The fix: import AMD’s GPG key properly before running the Tailscale installer.
# Re-add the rocm repo with proper GPG key
sudo rm -f /etc/apt/sources.list.d/rocm.list
wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/rocm.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/7.2.2 noble main" | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt update
# Now Tailscale installs cleanly
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
Coding from My Laptop, GPU in the Other Room
With Tailscale up, the desktop’s Tailscale IP is one command away:
tailscale ip -4 # 100.x.x.x
sudo ufw allow 13305
From my laptop, I pointed OpenCode at http://100.x.x.x:13305/v1 with a dummy API key in the project’s opencode.json. Opened VS Code. Sent a prompt. Watched nvtop on the desktop spike to 80% GPU utilisation as the RX 6600 ground through the inference. The response came back in seconds.
It works. My laptop sends the request over Tailscale’s encrypted tunnel, my desktop’s GPU does the heavy lifting, and the response comes back as if it were hitting any OpenAI-compatible endpoint. No cloud, no API bill, no data leaving the house.
What Took the Most Time
| Problem | Time Lost | Fix |
|---|---|---|
| Broken AMD installer repo | ~30 min | Delete it, add repos manually |
| amdgpu-dkms on kernel 6.17 | ~45 min | Use --no-dkms |
| Docker group_add group names | ~20 min | Use numeric GIDs from getent |
| lemonade not in exec PATH | ~10 min | Use full /usr/local/bin/lemonade |
| ROCm GPG key breaking Tailscale install | ~40 min | Import GPG key with --dearmor first |
None of these are hard problems in hindsight. They’re just the kind of thing that’s invisible until you hit it — and that’s exactly why I wrote this down. The next time I need to reinstall (and there will be a next time), I’m following my own notes.
The RX 6600 turned out to be a perfectly capable local AI card once the software stack stopped fighting itself. 8GB VRAM handles 4B–7B parameter quantised models without breaking a sweat, and the latency over Tailscale is low enough that it feels local even from the laptop. Not bad for a GPU that AMD’s own documentation barely acknowledges in the ROCm compatibility tables.
Then, I needed to update to the AMD r9700 Ai Pro GPU, 32 GB VRAM. But thats another story.