feat(ess_pro): deploy Element Server Suite Pro via K3s + Helm

Adds k3s and ess_pro roles to replace the planned Nextcloud Talk
stack. Integrates with existing Keycloak (OIDC), Garage (S3 media)
and OpenBao (secrets). Hostnames under digitalboard.ch.
Tobias Wüst 2026-05-27 23:46:37 +02:00
parent c11f019aae
commit 01fd12d75c
18 changed files with 1098 additions and 0 deletions

roles/ess-pro/README.md

@@ -0,0 +1,221 @@
# Ansible Role: ess_pro

Deploys Element Server Suite Pro on a single-node K3s cluster, using the
official `oci://registry.element.io/matrix-stack` Helm chart.

Follows the conventions of the other `digitalboard.core` roles
(bookstack, opnform, homarr): the role itself is secrets-agnostic;
sensitive values are supplied via `group_vars/ess_servers.yml` as
`community.hashi_vault` lookups against OpenBao.

Replaces the previously-planned `coturn` + `nextcloud-talk-hpb`
(spreed-signaling + Janus) stack with a fully-fledged Matrix backend
(Synapse Pro, MAS, Element Web, Element Admin, Element Call / LiveKit).

---
## Hostnames
| Component | Default hostname |
| --------------------- | ----------------------------- |
| Matrix `serverName` | `digitalboard.ch` |
| Synapse | `matrix.digitalboard.ch` |
| MAS | `mas.digitalboard.ch` |
| Element Web | `chat.digitalboard.ch` |
| Element Admin Panel | `admin.digitalboard.ch` |
| Matrix RTC / LiveKit | `rtc.digitalboard.ch` |
| `.well-known` apex | `digitalboard.ch` |
Note: MAS uses `mas.*` because `auth.digitalboard.ch` is already owned
by Keycloak in the reference infrastructure. Override the whole map via
`ess_pro_hostnames` if needed (e.g. for `wksbern.ch`).
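
To preview what the table above would look like for another base domain, a small shell sketch can print the equivalent defaults. The function name `ess_default_hostnames` and the labels in its first column are hypothetical and not part of the role; the real key names inside `ess_pro_hostnames` live in `defaults/main.yml` and are not reproduced here.

```bash
# Hypothetical helper (not part of the role): print the default hostname
# layout from the table above for an arbitrary base domain.
ess_default_hostnames() (
  domain="$1"
  if [ -z "$domain" ]; then
    echo "usage: ess_default_hostnames <base-domain>" >&2
    exit 1   # leaves only the subshell that forms the function body
  fi
  printf 'serverName   %s\n' "$domain"
  printf 'synapse      matrix.%s\n' "$domain"
  printf 'mas          mas.%s\n' "$domain"
  printf 'element-web  chat.%s\n' "$domain"
  printf 'admin        admin.%s\n' "$domain"
  printf 'rtc          rtc.%s\n' "$domain"
  printf 'well-known   %s\n' "$domain"
)

# Usage: ess_default_hostnames wksbern.ch
```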
---
## Architecture
```
                   ┌────────────────────────────────────────────┐
Internet ──HTTPS──▶│ DMZ Traefik (reference-ansible)            │
                   │ chat.* mas.* matrix.* admin.* rtc.*        │
                   └───────────────────┬────────────────────────┘
                                       │ HTTP (TLS terminated)
                   ┌────────────────────────────────────────────┐
                   │ ess host (Debian bookworm + K3s)           │
                   │  ┌──────────────────────────────────────┐  │
                   │  │ ess namespace                        │  │
                   │  │ • synapse-pro                        │  │
                   │  │ • matrix-authentication-service      │  │
                   │  │ • element-web                        │  │
                   │  │ • element-admin                      │  │
                   │  │ • matrix-rtc (lk-jwt + LiveKit SFU)  │  │
                   │  │ • haproxy / well-known               │  │
                   │  └──────────────────────────────────────┘  │
                   └────────────────────────────────────────────┘
                                       │ UDP 50000-60000
                             LiveKit ICE candidates
```
Integration with the existing reference-ansible stack:
- **DMZ Traefik** terminates TLS, forwards HTTP to the K3s node.
- **Keycloak** on `auth.digitalboard.ch` (Realm `Digitalboard`) is MAS'
upstream OIDC provider — same SSO story as bookstack/opnform/homarr.
- **Garage** (S3-compatible) hosts the Synapse media store via the
`ess-media` bucket.
- **OpenBao** supplies the secrets, using the same path layout
  (`kv/digitalboard/<service>`) as the other roles.
- **Cluster lives on the same VM** that was previously planned for
coturn/HPB, because it has the right DMZ NAT topology for SFU UDP.
---
## Prerequisites
1. Ansible collections on the control node:
   ```bash
   ansible-galaxy collection install \
     kubernetes.core community.general community.hashi_vault
   pip install kubernetes pyyaml hvac
   ```
2. ESS Pro subscription credentials, the Keycloak client secret and the
   Garage S3 keys in OpenBao at `kv/digitalboard/ess-pro` (KV v2, flat
   keys):
   ```bash
   bao kv put kv/digitalboard/ess-pro \
     username='ess-customer-xxx' \
     token='paste-from-customer.element.io' \
     client_secret='from-keycloak' \
     s3_access_key='...' \
     s3_secret_key='...'
   ```
   See `examples/openbao-bootstrap.sh` for an interactive helper.
3. Keycloak OIDC client `ess-mas` in the `Digitalboard` realm with
   redirect URI
   `https://mas.digitalboard.ch/upstream/callback/01J0KCK0DNNNDIGITALBOARDKC01`.
4. Garage bucket `ess-media` with a dedicated access key.
5. DNS A/AAAA records for `matrix.`, `mas.`, `chat.`, `admin.`, `rtc.`
   and the apex `digitalboard.ch`, pointing at the DMZ Traefik.
6. DMZ firewall NAT-forwards UDP `50000-60000` (configurable) and TCP
   `7881` to the K3s node (LiveKit's media ports).
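
Before the first run it can help to confirm that the OpenBao entry from step 2 carries all five flat keys. The following is a minimal sketch, not part of the role: `ess_missing_keys` is a hypothetical name, and it reads key names from stdin so the check itself needs neither OpenBao nor network access. The pipeline in the trailing comment assumes `jq` is installed and that `bao kv get -format=json` returns the usual KV v2 envelope with the values under `.data.data`.

```bash
# Sketch: report which of the flat keys from the `bao kv put` example
# are absent. Key names arrive on stdin, one per line.
ess_missing_keys() (
  present="$(cat)"
  status=0
  for key in username token client_secret s3_access_key s3_secret_key; do
    # -x matches whole lines only, so "token" does not match "token_old".
    if ! printf '%s\n' "$present" | grep -qx "$key"; then
      echo "missing: $key"
      status=1
    fi
  done
  exit "$status"
)

# Possible usage (assumes jq and a logged-in bao CLI):
#   bao kv get -format=json kv/digitalboard/ess-pro \
#     | jq -r '.data.data | keys[]' | ess_missing_keys
```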
---
## Required variables
| Variable | Notes |
| --------------------------------- | ------------------------------------ |
| `ess_pro_registry_username` | OpenBao lookup — see example |
| `ess_pro_registry_token` | OpenBao lookup |
| `ess_pro_oidc_client_secret` | OpenBao lookup (when OIDC enabled) |
| `ess_pro_s3_access_key` | OpenBao lookup (when S3 enabled) |
| `ess_pro_s3_secret_key` | OpenBao lookup (when S3 enabled) |
| `ess_pro_rtc_external_ip` | DMZ public IP for LiveKit ICE |
See `defaults/main.yml` for everything else. The `examples/` directory
contains a ready-to-use `group_vars/ess_servers.yml` with all the
OpenBao lookups pre-wired.
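
A quick text check can catch a forgotten or misspelled variable before Ansible does. The sketch below is an illustration only: `ess_check_required_vars` is a hypothetical helper, it treats all six variables as required (which matches a setup with both OIDC and S3 enabled), and it does a plain line match rather than parsing YAML.

```bash
# Sketch: confirm that every variable from the table above is assigned
# at the top level of a group_vars file. This is a text match, not a
# YAML parse, so it only catches omissions and typos in the names.
ess_check_required_vars() (
  file="$1"
  status=0
  for var in ess_pro_registry_username ess_pro_registry_token \
             ess_pro_oidc_client_secret ess_pro_s3_access_key \
             ess_pro_s3_secret_key ess_pro_rtc_external_ip; do
    if ! grep -q "^${var}:" "$file"; then
      echo "not set: $var" >&2
      status=1
    fi
  done
  exit "$status"
)

# Usage: ess_check_required_vars group_vars/ess_servers.yml
```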
---
## Example playbook
```yaml
- name: Deploy Element Server Suite Pro
  hosts: ess_servers
  become: true
  roles:
    - digitalboard.core.k3s
    - digitalboard.core.ess_pro
```
With inventory variables (`group_vars/ess_servers.yml`):
```yaml
ess_pro_server_name: "digitalboard.ch"
ess_pro_oidc_enabled: true
ess_pro_s3_media_enabled: true
ess_pro_rtc_external_ip: "203.0.113.42"
ess_pro_registry_username: "{{ lookup('community.hashi_vault.vault_kv2_get',
    'digitalboard/ess-pro',
    mount_point='kv').data.data.username }}"
ess_pro_registry_token: "{{ lookup('community.hashi_vault.vault_kv2_get',
    'digitalboard/ess-pro',
    mount_point='kv').data.data.token }}"
ess_pro_oidc_client_secret: "{{ lookup('community.hashi_vault.vault_kv2_get',
    'digitalboard/ess-pro',
    mount_point='kv').data.data.client_secret }}"
ess_pro_s3_access_key: "{{ lookup('community.hashi_vault.vault_kv2_get',
    'digitalboard/ess-pro',
    mount_point='kv').data.data.s3_access_key }}"
ess_pro_s3_secret_key: "{{ lookup('community.hashi_vault.vault_kv2_get',
    'digitalboard/ess-pro',
    mount_point='kv').data.data.s3_secret_key }}"
```
---
## Post-deploy
1. Get the bootstrap admin password:
   ```bash
   kubectl -n ess get secrets/ess-generated \
     -o jsonpath='{.data.ADMIN_USER_PASSWORD}' | base64 -d
   ```
2. Log in to `https://admin.digitalboard.ch` as
   `@localadmin:digitalboard.ch`.
3. Create users via the Admin Panel or via MAS:
   ```bash
   kubectl -n ess exec -it deploy/ess-matrix-authentication-service -- \
     mas-cli manage register-user
   ```
4. If OIDC is enabled, users can also log in directly via Keycloak from
   the Element Web client.
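
The password lookup in step 1 prints nothing useful if the secret or key is absent, because `base64 -d` happily decodes an empty string. A hedged variant that fails loudly could look like the sketch below. The function name and the `KUBECTL` override are hypothetical additions for this example; the override exists only so the function can be exercised without a cluster.

```bash
# Sketch: fetch and decode the bootstrap admin password, failing loudly
# if the secret or key is absent. KUBECTL is a hypothetical override
# hook; when unset, the real kubectl binary is used.
ess_bootstrap_password() (
  encoded="$(${KUBECTL:-kubectl} -n ess get secrets/ess-generated \
    -o jsonpath='{.data.ADMIN_USER_PASSWORD}')" || exit 1
  if [ -z "$encoded" ]; then
    echo "ADMIN_USER_PASSWORD not found in secrets/ess-generated" >&2
    exit 1
  fi
  # Kubernetes stores Secret values base64-encoded, hence the decode.
  printf '%s' "$encoded" | base64 -d
)

# Usage: ess_bootstrap_password
```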
---
## Operations
- **Re-deploy / config change**: re-run the playbook. `kubernetes.core.helm`
performs `helm upgrade --install` — idempotent.
- **Upgrade chart version**: bump `ess_pro_chart_version`, re-run.
- **Rotate the Element token**: update it in OpenBao, re-run the
playbook. The role re-creates the image pull secret and re-authenticates
the Helm CLI.
- **Rendered values.yaml** on the host: `/etc/ess/values.yaml`.
- **Tear down**:
  ```bash
  helm uninstall -n ess ess && kubectl delete ns ess
  ```
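
Since the tear-down is destructive, it may be worth previewing it first. The sketch below wraps the same two commands in a dry-run switch. `ess_run`, `ess_teardown` and the `DRY_RUN` variable are hypothetical names introduced for this example and are not provided by the role.

```bash
# Sketch: wrap the tear-down so it can be previewed before it runs.
# DRY_RUN is a hypothetical switch introduced for this example.
ess_run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"   # print the command instead of executing it
  else
    "$@"
  fi
}

ess_teardown() {
  # Same commands and order as the one-liner above; the namespace is
  # only deleted if the Helm release was removed successfully.
  ess_run helm uninstall -n ess ess &&
    ess_run kubectl delete ns ess
}

# Preview: DRY_RUN=1 ess_teardown
# Execute: ess_teardown
```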
---
## Known caveats
- The bundled in-cluster Postgres is **not for production** — point at
an external Postgres VM before going live.
- TLS termination on the DMZ Traefik means well-known delegation and
Element Call ICE rely on the upstream proxy sending correct
`X-Forwarded-Proto`. Synapse is configured with `x_forwarded: true`;
verify with `curl https://digitalboard.ch/.well-known/matrix/server`.
- The field names in `templates/values.yaml.j2` follow the upstream ESS
  Pro Helm chart. If a future chart version renames a field (e.g.
  `matrixRTC.sfu.additional`), update the template accordingly. Run
  `helm show values oci://registry.element.io/matrix-stack` after
  major upgrades to spot such changes.
- The `serverName` is **immutable** after first deploy.
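
The well-known check from the caveats can be scripted. Per the Matrix specification, `/.well-known/matrix/server` returns a JSON object whose `m.server` field names the delegated host, optionally with a port. The sketch below tests a response body against an expected host. `ess_check_delegation` is a hypothetical helper, the expected value `matrix.digitalboard.ch` is an assumption based on the hostname table, and the match is a pattern test rather than a full JSON parse.

```bash
# Sketch: check that a /.well-known/matrix/server response delegates to
# the expected Synapse host, with or without an explicit port.
ess_check_delegation() (
  body="$1"
  # Escape dots so the host name is matched literally in the regex.
  host_re="$(printf '%s' "$2" | sed 's/\./\\./g')"
  printf '%s' "$body" |
    grep -Eq "\"m\.server\"[[:space:]]*:[[:space:]]*\"${host_re}(:[0-9]+)?\""
)

# Possible usage:
#   ess_check_delegation \
#     "$(curl -fsS https://digitalboard.ch/.well-known/matrix/server)" \
#     matrix.digitalboard.ch && echo "delegation OK"
```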