Overview
Hetzner Cloud offers some of the lowest prices per vCPU in Europe, which makes it a good fit for CPU-based AI inference workloads. This guide takes you from zero to a hardened, Docker-ready server in under 30 minutes.
What you will have at the end:
- Non-root user with sudo
- UFW firewall (only ports 22, 80, 443 open)
- fail2ban blocking brute-force SSH
- Docker + Docker Compose installed
- Automated backups, plus manual snapshots before upgrades
1. Create a Project
- Sign up at console.hetzner.cloud.
- Click New Project → name it (e.g., ai-infra).
- Stay in the project; all resources below are created inside it.
2. Choose a Server Type
| Type | vCPU | RAM | Disk | Price/mo |
|---|---|---|---|---|
| CX22 | 2 | 4 GB | 40 GB NVMe | €4.51 |
| CPX31 | 4 | 8 GB | 160 GB NVMe | €13.49 |
| CPX41 | 8 | 16 GB | 240 GB NVMe | €26.49 |
Rule of thumb:
- CX22 → API proxy, lightweight agents, small Ollama models (≤3B).
- CPX31 → Ollama with 7–8B models, self-hosted inference gateway.
- CPX41 → Multiple models, vector DB, full agent stack.
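The rule of thumb can be written down as a small shell helper. This is only a sketch: `pick_server_type` is a hypothetical name, and the cut-offs for sizes the list does not mention (4–6B, and anything above 8B) are assumptions rather than Hetzner or Ollama guidance.

```shell
#!/bin/sh
# Hypothetical helper: map a model size (billions of parameters, integer)
# to one of the server types from the table above.
pick_server_type() {
    params_b=$1
    if [ "$params_b" -le 3 ]; then
        echo "CX22"     # small models (<=3B), proxies, lightweight agents
    elif [ "$params_b" -le 8 ]; then
        echo "CPX31"    # 7-8B models; 4-6B is assumed to fit here as well
    else
        echo "CPX41"    # assumption: anything larger goes to the 16 GB server
    fi
}

# Usage:
#   pick_server_type 7    # prints CPX31
```

Treat the result as a starting point; actual memory use depends on quantization and context length.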
Select Ubuntu 24.04 as the OS image.
3. Upload Your SSH Key
Before creating the server, add your public key:
# on your local machine
cat ~/.ssh/id_ed25519.pub
Copy the output. In the Hetzner console: SSH Keys → Add SSH Key → paste → save.
If you do not have a key yet:
ssh-keygen -t ed25519 -C "hetzner-ai"
# press Enter twice for no passphrase (or set one)
Select your key during server creation. Hetzner injects it into /root/.ssh/authorized_keys automatically.
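If you would rather create the server from a terminal, the hcloud CLI (installed in step 10) can do the same thing. The sketch below wraps the call in a function; `create_server`, the server name `ai-1`, and the key name `hetzner-ai` are hypothetical and must match whatever you named the key in the console.

```shell
#!/bin/sh
# Create a server with the Ubuntu 24.04 image and an SSH key that was
# already uploaded to the project. Requires a configured hcloud context.
create_server() {
    server_name=$1    # e.g. ai-1 (hypothetical)
    server_type=$2    # e.g. cx22, cpx31, cpx41
    key_name=$3       # name of the SSH key as shown in the console
    hcloud server create \
        --name "$server_name" \
        --type "$server_type" \
        --image ubuntu-24.04 \
        --ssh-key "$key_name"
}

# Usage:
#   create_server ai-1 cpx31 hetzner-ai
```

The command prints the new server's IP address, which is the `<SERVER_IP>` used in the following steps.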
4. First Login
ssh root@<SERVER_IP>
# confirm the fingerprint on first connect
Update packages immediately:
apt update && apt upgrade -y
apt install -y curl wget git unzip
5. Create a Non-Root User
Never run production workloads as root.
adduser deploy
# enter a password, skip the other prompts
usermod -aG sudo deploy
# copy SSH key to the new user
rsync --archive --chown=deploy:deploy ~/.ssh /home/deploy
Test the new user in a second terminal before closing the root session:
ssh deploy@<SERVER_IP>
sudo whoami # should return: root
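A quick way to double-check group membership without switching users is to inspect the output of `id -nG`. The helper below is a minimal sketch; `has_group` is a hypothetical name, and it only reads a group list from standard input.

```shell
#!/bin/sh
# Succeeds if the space-separated group list on stdin (the format that
# `id -nG` prints) contains the group given as the first argument.
has_group() {
    tr ' ' '\n' | grep -q -x -F -- "$1"
}

# Usage on the server:
#   id -nG deploy | has_group sudo && echo "deploy is in the sudo group"
```

The same check works later for the `docker` group once Docker is installed.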
6. Harden SSH
Edit /etc/ssh/sshd_config:
sudo nano /etc/ssh/sshd_config
Set these values (add or uncomment):
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
Port 22
Reload:
sudo systemctl reload ssh
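On Ubuntu, drop-in files under /etc/ssh/sshd_config.d/ are included from the top of sshd_config and may override the values you just set, so it is worth checking the effective configuration. `sshd -t` validates the syntax and `sshd -T` prints the effective settings in lowercase. The function below is a sketch that checks that output against the four values from this step; `check_sshd_settings` is a hypothetical name.

```shell
#!/bin/sh
# Read `sshd -T` output on stdin and compare it with the values set above.
# Prints one line per problem and returns non-zero if anything is off.
check_sshd_settings() {
    awk '
        BEGIN {
            bad = 0
            want["permitrootlogin"] = "no"
            want["passwordauthentication"] = "no"
            want["pubkeyauthentication"] = "yes"
            want["port"] = "22"
        }
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                printf "MISMATCH %s: got %s, want %s\n", $1, $2, want[$1]
                bad = 1
            }
        }
        END {
            for (k in want) {
                if (!(k in seen)) { printf "MISSING %s\n", k; bad = 1 }
            }
            exit bad
        }
    '
}

# Usage on the server (keep your current session open while testing):
#   sudo sshd -t && sudo sshd -T | check_sshd_settings
```

If a mismatch shows up, look for a conflicting line in a file under /etc/ssh/sshd_config.d/ before reloading.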
7. Configure UFW Firewall
sudo apt install -y ufw
# default: deny incoming, allow outgoing
sudo ufw default deny incoming
sudo ufw default allow outgoing
# allow your access
sudo ufw allow 22/tcp # SSH
sudo ufw allow 80/tcp # HTTP
sudo ufw allow 443/tcp # HTTPS
sudo ufw --force enable
sudo ufw status verbose
Output should show Status: active and ALLOW rules for ports 22, 80, and 443 (each listed a second time for IPv6 if IPv6 is enabled).
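To confirm that nothing beyond the three intended ports is open, the `ufw status` output can be filtered. This is a sketch that assumes the usual table layout (port in the first column, the word ALLOW on the same line); `unexpected_ports` is a hypothetical name.

```shell
#!/bin/sh
# Read `ufw status` output on stdin and print every allowed port that is
# not one of 22/tcp, 80/tcp, 443/tcp. Prints nothing if the rules match.
unexpected_ports() {
    awk '/ALLOW/ { print $1 }' | sort -u |
        grep -v -x -e '22/tcp' -e '80/tcp' -e '443/tcp' || true
}

# Usage on the server:
#   sudo ufw status | unexpected_ports
```

An empty result means only SSH, HTTP, and HTTPS are allowed in.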
8. Install fail2ban
fail2ban bans IPs after repeated failed SSH attempts.
sudo apt install -y fail2ban
sudo tee /etc/fail2ban/jail.local <<'EOF'
[DEFAULT]
bantime = 10m
findtime = 10m
maxretry = 5
[sshd]
enabled = true
port = 22
EOF
sudo systemctl enable --now fail2ban
sudo fail2ban-client status sshd
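With these settings, five failed logins within ten minutes ban an address for ten minutes. If you lock yourself out while testing, it helps to see who is banned and lift the ban. The sketch below extracts the addresses from the status output, assuming it contains a "Banned IP list:" line; `banned_ips` is a hypothetical name.

```shell
#!/bin/sh
# Read `fail2ban-client status sshd` output on stdin and print the
# currently banned addresses, one per line.
banned_ips() {
    awk -F'Banned IP list:' 'NF > 1 {
        n = split($2, ips, " ")
        for (i = 1; i <= n; i++) print ips[i]
    }'
}

# Usage on the server:
#   sudo fail2ban-client status sshd | banned_ips
# Lift a ban (203.0.113.10 is a placeholder address):
#   sudo fail2ban-client set sshd unbanip 203.0.113.10
```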
9. Install Docker
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker deploy
# log out and back in for the group change to take effect
Verify:
docker run --rm hello-world
The get.docker.com script normally installs the Docker Compose plugin alongside the engine; confirm it is present:
docker compose version
10. Set Up Automated Snapshots
Hetzner charges €0.0119/GB/month for snapshots. For a CX22 (40 GB, typically 5–10 GB compressed), that is ~€0.06–0.12/mo.
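The estimate is simply stored size times the per-GB rate. A small sketch of that arithmetic follows; `snapshot_cost` is a hypothetical helper, and the default rate is the €0.0119 figure quoted above, which may change.

```shell
#!/bin/sh
# Monthly snapshot cost in EUR: size in GB times the per-GB rate.
# The second argument overrides the rate if Hetzner's pricing changes.
snapshot_cost() {
    awk -v gb="$1" -v rate="${2:-0.0119}" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# Usage:
#   snapshot_cost 5     # prints 0.06
#   snapshot_cost 10    # prints 0.12
```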
In the Hetzner console: Server → Backups → enable (adds 20% to the server cost). Alternatively, take a snapshot manually before upgrades:
# via hcloud CLI (optional)
hcloud server create-image <SERVER_ID> --type snapshot --description "pre-upgrade-$(date +%Y%m%d)"
For CLI access:
# install hcloud on your local machine
curl -Lo hcloud.tar.gz https://github.com/hetznercloud/cli/releases/latest/download/hcloud-linux-amd64.tar.gz
tar -xzf hcloud.tar.gz
sudo mv hcloud /usr/local/bin/
hcloud context create ai-infra # paste API token from console
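Once the CLI is configured, the manual snapshot command can be wrapped in a script and run on a schedule. The following is a sketch, not a finished backup solution: the function names, the `scheduled` prefix, and the cron line are assumptions, and it does not delete old snapshots, so storage costs grow until you prune them.

```shell
#!/bin/sh
# Build a dated description such as "nightly-20250101".
snapshot_description() {
    printf '%s-%s\n' "$1" "$(date +%Y%m%d)"
}

# Create a snapshot of the given server (ID or name) with a dated
# description. The prefix defaults to "scheduled" (hypothetical).
snapshot_server() {
    server=$1
    prefix=${2:-scheduled}
    hcloud server create-image "$server" --type snapshot \
        --description "$(snapshot_description "$prefix")"
}

# Usage:
#   snapshot_server <SERVER_ID> nightly
# Example crontab entry (assumed script path, runs daily at 03:00):
#   0 3 * * * /usr/local/bin/snapshot.sh
```

Run the cron job as the user that created the hcloud context so the stored API token is found.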
11. Point a DNS Record
In your DNS provider, add an A record:
A ai.yourdomain.com <SERVER_IP> TTL 300
Verify propagation:
dig +short ai.yourdomain.com
# should return your server IP within 5 minutes
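If you are scripting the setup, you can poll until the record resolves instead of re-running dig by hand. This is a minimal sketch; `wait_for_dns` and its default of 30 attempts, 10 seconds apart (about five minutes), are assumptions chosen to match the TTL above.

```shell
#!/bin/sh
# Poll `dig` until the host's A record matches the expected IP.
# Arguments: host, expected IP, attempts (default 30), delay in seconds (default 10).
wait_for_dns() {
    dns_host=$1
    dns_expected=$2
    dns_tries=${3:-30}
    dns_delay=${4:-10}
    dns_i=0
    while [ "$dns_i" -lt "$dns_tries" ]; do
        if dig +short "$dns_host" A | grep -q -x -F -- "$dns_expected"; then
            echo "$dns_host resolves to $dns_expected"
            return 0
        fi
        dns_i=$((dns_i + 1))
        sleep "$dns_delay"
    done
    echo "$dns_host did not resolve to $dns_expected after $dns_tries attempts" >&2
    return 1
}

# Usage:
#   wait_for_dns ai.yourdomain.com <SERVER_IP>
```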
12. Cost Summary
| Config | Use Case | Monthly Cost |
|---|---|---|
| CX22 | API proxy, small agents | ~€4.51 |
| CPX31 + snapshots | 7B models, agent stack | ~€14–16 |
| CPX31 + IPv4 + backups | Production inference | ~€16–18 |
IPv4 addresses cost an additional €0.50/mo. IPv6 is free and works for most setups.
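The totals in the table follow from the base price plus the two surcharges mentioned in this guide: backups at 20% of the server price and €0.50 for an IPv4 address. A sketch of that calculation, with `monthly_cost` as a hypothetical helper and prices taken from the figures above:

```shell
#!/bin/sh
# Monthly cost in EUR.
# Arguments: base price, backups (yes/no, default no), IPv4 (yes/no, default no).
monthly_cost() {
    awk -v base="$1" -v backups="${2:-no}" -v ipv4="${3:-no}" 'BEGIN {
        total = base
        if (backups == "yes") total += base * 0.20   # backups add 20% of the server price
        if (ipv4 == "yes")    total += 0.50          # IPv4 address surcharge
        printf "%.2f\n", total
    }'
}

# Usage:
#   monthly_cost 13.49 yes yes    # CPX31 + backups + IPv4, prints 16.69
#   monthly_cost 4.51             # CX22 alone, prints 4.51
```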
Next Steps
- Install Ollama on the server:
curl -fsSL https://ollama.com/install.sh | sh
- Set up Nginx as a reverse proxy with Let's Encrypt
- Deploy an OpenAI-compatible inference gateway (LiteLLM, llama.cpp server)