digitalboard.core/roles/ess_pro_compose/README.md
Tobias Wüst 32eca6b923 feat(ess-pro/compose): deploy Element Server Suite Pro via Compose
initial commit of the role, converted from Helm charts for Kubernetes to a Compose-based Ansible role
2026-06-04 10:52:05 +02:00


Ansible Role: ess_pro_compose

Deploys the full Element Server Suite Pro v26.5.1 stack as a single docker compose project, modelled 1:1 on Element's official matrix-stack Helm chart. The stack is fronted by the existing DMZ Traefik, secrets are sourced from OpenBao (plus locally generated cryptographic material), and the role follows the same conventions as the other digitalboard.core roles.

Licensing note: ESS Pro is distributed as a Helm/Kubernetes product. Running the Pro images under docker compose requires explicit vendor agreement, which is in place for this deployment.

Architecture

12 services, mirroring the chart:

                                            ┌───────────────┐
              ┌──────────────────────HTTP──▶│   element-web │
              │                             └───────────────┘
              │                             ┌───────────────┐
              │  ┌──────────────────HTTP──▶│ element-admin │
              │  │                          └───────────────┘
              │  │                          ┌───────────────┐
              │  │  ┌───────────────HTTP──▶│     mas       │ ─┐
DMZ Traefik ──┤  │  │                       └───────────────┘  │  ┌──────────┐
              │  │  │                       ┌───────────────┐  ├─▶│ postgres │
              │  │  │  ┌────────────HTTP──▶│   haproxy     │  │  └──────────┘
              │  │  │  │                    │  (Pro Image)  │  │  ┌──────────┐
              │  │  │  │                    └───┬─────────┬─┘  │  │   redis  │
              │  │  │  │                        │         │    │  └──────────┘
              │  │  │  │      ┌─────────────────┘         │    │
              │  │  │  │      ▼                           ▼    │
              │  │  │  │  ┌──────────────┐    ┌────────────────┴───────┐
              │  │  │  │  │ synapse-main │◀──▶│ synapse-fed-reader-0..N│
              │  │  │  │  │  (Python)    │    │  (Rust Pro worker)     │
              │  │  │  │  └──────────────┘    └────────────────────────┘
              │  │  │  │
              │  │  │  └──HTTP(/.well-known)──▶ haproxy (same instance)
              │  │  │
              │  │  └─────HTTP(/sfu/get)──────▶┌──────────────────┐
              │  │                              │ matrix-rtc-auth │ (lk-jwt)
              │  │                              └──────────┬───────┘
              │  └─HTTP+TCP/30001+UDP/30002───▶┌──────────▼───────┐
              │                                 │  matrix-rtc-sfu │ (LiveKit)
              │                                 └──────────────────┘
              │
              └─ HTTPS termination on Traefik, plain HTTP downstream

Hostnames

| Component                | Hostname                |
|--------------------------|-------------------------|
| Matrix serverName        | digitalboard.ch         |
| Synapse (via HAProxy)    | matrix.digitalboard.ch  |
| MAS                      | account.digitalboard.ch |
| Element Web              | chat.digitalboard.ch    |
| Element Admin            | admin.digitalboard.ch   |
| Matrix RTC / Element Call | mrtc.digitalboard.ch   |
| .well-known/matrix/      | digitalboard.ch (apex)  |

Naming follows Element's official docs (account.*, mrtc.*). Keycloak on auth.digitalboard.ch is untouched.

Prerequisites

  1. Ansible collections and Python libraries on the control node:

    ansible-galaxy collection install community.docker community.hashi_vault
    pip install docker hvac
    
  2. Target host: Debian bookworm with Docker CE + compose plugin (the shared digitalboard docker role handles this) and python3-cryptography.

  3. DMZ Traefik attached to the proxy network with a websecure entrypoint and a letsencrypt certresolver.

  4. DNS A/AAAA records for the apex + five subdomains.

  5. DMZ firewall NAT-forwards TCP/30001 and UDP/30002 to the host. These are the Element Call media ports; they are fixed by the chart, so the wide 50k-60k range is not needed.

  6. ESS Pro registry credentials (and Authentik OIDC client secret) bootstrapped in OpenBao at kv/digitalboard/ess-compose via examples/openbao-bootstrap.sh.

How secrets work

Two layers:

  • From OpenBao: Element registry username/token and Authentik OIDC client secret. Pulled at playbook time via community.hashi_vault.vault_kv2_get lookups, same pattern as the other digitalboard.core roles.

  • Generated locally: The 14 cryptographic secrets that the chart's init-secrets job normally produces (Synapse signing key, MAS RSA/ECDSA keys, Synapse↔MAS shared secret, replication secret, Postgres passwords, LiveKit secret, admin user password). A Python script bundled with the role generates them into /opt/ess/secrets/ on first run and never overwrites existing files, so repeated playbook runs are idempotent. All containers mount this directory read-only as /secrets/ess-generated/, which matches the chart's mount path.

The MAS RSA key is generated in DER PKCS8 format, the ECDSA key in PEM PKCS8, and the Synapse signing key in Synapse's native ed25519 <keyid> <base64> format. All formats were verified against the output of the chart's matrix-tools generate-secrets.
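The write-once behaviour described above can be sketched roughly as follows. This is not the script bundled with the role: the function names, the secret file names, and the 0600 file mode are assumptions for illustration, and "rand32" / "hex32" are interpreted here as 32 random bytes in URL-safe base64 and hex encoding respectively. Key-pair generation (RSA, ECDSA, ed25519) is left out because it needs the cryptography library.

```python
import os
import secrets
import tempfile
from pathlib import Path


def rand32_urlsafe() -> str:
    """32 random bytes, URL-safe base64 encoded (assumed meaning of rand32)."""
    return secrets.token_urlsafe(32)


def hex32() -> str:
    """32 random bytes as 64 hex characters (assumed meaning of hex32)."""
    return secrets.token_hex(32)


def ensure_secret(directory: Path, name: str, generator) -> bool:
    """Create directory/name once; return True only if it was newly written."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / name
    try:
        # O_EXCL makes creation fail if the file already exists,
        # so an existing secret is never overwritten.
        fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(generator())
    return True


if __name__ == "__main__":
    # A temporary directory stands in for /opt/ess/secrets in this demo.
    demo_dir = Path(tempfile.mkdtemp())
    for secret_name, gen in (("EXAMPLE_SHARED_SECRET", rand32_urlsafe),
                             ("EXAMPLE_HEX_SECRET", hex32)):
        created = ensure_secret(demo_dir, secret_name, gen)
        print(secret_name, "created" if created else "kept")
```

Calling ensure_secret a second time for the same name returns False and leaves the file untouched, which is what makes repeated runs safe.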

Usage

# site.yml
- hosts: ess_servers
  become: true
  roles:
    - digitalboard.core.ess_pro_compose

# inventory/group_vars/ess_servers.yml  -- see examples/
ess_server_name: "digitalboard.ch"
ess_synapse_fed_reader_replicas: 5
ess_oidc_enabled: true
ess_oidc_issuer: "https://authentik.digitalboard.ch/application/o/ess/"
ess_rtc_external_ip: "203.0.113.42"

ess_registry_username: "{{ lookup('community.hashi_vault.vault_kv2_get', ...).data.data.registry_username }}"
ess_registry_token:    "{{ lookup('community.hashi_vault.vault_kv2_get', ...).data.data.registry_token }}"
ess_oidc_client_secret: "{{ lookup('community.hashi_vault.vault_kv2_get', ...).data.data.oidc_client_secret }}"

Run: ansible-playbook -i inventories/digitalboard site.yml

The role creates @localadmin:digitalboard.ch via mas-cli and prints the location of the generated password (/opt/ess/secrets/ADMIN_USER_PASSWORD on the host).

Post-deploy verification

# All containers healthy
docker compose -f /opt/ess/compose.yml ps

# Synapse + MAS<-->Synapse wiring
curl -sS https://matrix.digitalboard.ch/_matrix/client/versions | jq .versions
curl -sS https://digitalboard.ch/.well-known/matrix/server | jq
curl -sS https://digitalboard.ch/.well-known/matrix/client | jq

# MAS sanity
docker compose -f /opt/ess/compose.yml exec mas \
  mas-cli --config /conf/mas-config.yaml doctor

# HAProxy stats (internally)
docker compose -f /opt/ess/compose.yml exec haproxy \
  wget -qO- http://localhost:8405/metrics | head
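
If the well-known checks above should run in a script rather than by eye, the two response bodies can be validated as in the sketch below. The m.server and m.homeserver.base_url keys come from the Matrix specification; the function names and the sample bodies are hypothetical, and the host comparison only handles plain hostnames (no IPv6 literals).

```python
import json


def check_well_known_server(body: str, expected_host: str) -> bool:
    """True if a /.well-known/matrix/server body delegates to expected_host."""
    data = json.loads(body)
    delegated = data.get("m.server", "")
    # The port is optional in m.server; compare only the host part.
    host = delegated.rsplit(":", 1)[0] if ":" in delegated else delegated
    return host == expected_host


def check_well_known_client(body: str, expected_base_url: str) -> bool:
    """True if a /.well-known/matrix/client body points at expected_base_url."""
    data = json.loads(body)
    base_url = data.get("m.homeserver", {}).get("base_url", "")
    # Ignore a trailing slash so both URL spellings compare equal.
    return base_url.rstrip("/") == expected_base_url.rstrip("/")


if __name__ == "__main__":
    # Hypothetical sample bodies, not captured from a real deployment.
    server_body = '{"m.server": "matrix.digitalboard.ch:443"}'
    client_body = '{"m.homeserver": {"base_url": "https://matrix.digitalboard.ch/"}}'
    print(check_well_known_server(server_body, "matrix.digitalboard.ch"))
    print(check_well_known_client(client_body, "https://matrix.digitalboard.ch"))
```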

Operations

  • Config change: re-run the playbook. Changed templates trigger per-component docker compose restart via handlers.
  • Image upgrade: bump ess_images.<component> in defaults or group_vars, re-run.
  • Scale federation-reader: change ess_synapse_fed_reader_replicas, re-run (HAProxy backend list is rendered from the same variable).
  • Logs: docker compose -f /opt/ess/compose.yml logs -f synapse-main
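  • Tear down: docker compose -f /opt/ess/compose.yml down -v (the -v flag also removes the project's named volumes, so persistent data stored in them is deleted)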
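
To illustrate how a single replica count can drive the HAProxy backend list, here is a minimal sketch. The role itself renders this from a Jinja2 template; the function name, the synapse-fed-reader-<i> host names, and port 8008 are assumptions for illustration and may not match the role's templates.

```python
def fed_reader_backend_lines(replicas: int,
                             name_prefix: str = "synapse-fed-reader-",
                             port: int = 8008) -> list[str]:
    """Render one HAProxy 'server' line per federation-reader replica."""
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return [
        # Compose service names double as DNS names, so name and address match.
        f"server {name_prefix}{i} {name_prefix}{i}:{port} check"
        for i in range(replicas)
    ]


if __name__ == "__main__":
    # Stand-in for ess_synapse_fed_reader_replicas: 5
    for line in fed_reader_backend_lines(5):
        print("    " + line)
```

Because the compose services and the backend entries are derived from the same number, scaling up or down cannot leave HAProxy pointing at replicas that do not exist.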

What's faithful to the chart, what's adapted

Faithful to chart v26.5.1:

  • All image paths from registry.element.io (correct repos: synapse-onprem, synapse-pro-worker, matrix-authentication-service, element-web-pro, element-admin, haproxy, livekit-server-distroless, lk-jwt-service, postgres, redis-distroless).
  • HAProxy config 1:1 from the chart (path-based routing to fed-reader for /event, /state, /state_ids, admin IP allow-list, well-known serving on port 8010, 429.http for queue overflow).
  • Synapse homeserver.yaml merged from the chart's four fragments (underrides + overrides + main listeners + log config) with both Pro modules loaded (synapse_ess_pro.EssPro, synapse_mass_local_room_upgrades.MassLocalRoomUpgradesModule).
  • MAS config with all four listeners (web 8080, internal 8081, root 8082, synapse 8083) and kind: synapse_modern for delegated auth.
  • federation-reader (Rust worker) config in its native schema, not Synapse-Python-worker syntax.
  • LiveKit on TCP 30001 + UDP 30002 muxed, with node_ip set for ICE.
  • Element Web config with Pro features (use_exclusively, element-pro mobile variant).
  • Init-secrets bundle generated with matching key types and formats (rand32 url-safe / hex32 / rsa:4096:der / ecdsaprime256v1 PEM / Synapse ed25519 signing key).
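
The ordered merge of configuration fragments mentioned above can be pictured with the sketch below. It assumes a recursive merge in which later fragments win and lists are replaced rather than concatenated; the exact semantics of the chart's matrix-tools may differ, and the fragment contents shown are hypothetical.

```python
from copy import deepcopy
from functools import reduce


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale (an assumption).
            merged[key] = deepcopy(value)
    return merged


def merge_fragments(fragments: list[dict]) -> dict:
    """Merge fragments left to right, so later fragments take precedence."""
    return reduce(deep_merge, fragments, {})


if __name__ == "__main__":
    # Hypothetical stand-ins for underrides, overrides, listeners, log config.
    underrides = {"limits": {"burst": 10, "rate": 1}, "report_stats": False}
    overrides = {"limits": {"rate": 5}}
    listeners = {"listeners": [{"port": 8008, "type": "http"}]}
    log_config = {"log_config": "/conf/log_config.yaml"}
    print(merge_fragments([underrides, overrides, listeners, log_config]))
```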

Adapted for compose:

  • K8s DNS-SRV service discovery (_synapse-http._tcp.X.svc.cluster.local) replaced with direct compose service names + the embedded DNS resolver (127.0.0.11:53). HAProxy backend entries use plain hostnames.
  • StatefulSet PVCs replaced with named docker volumes.
  • The chart's matrix-tools render-config init-container is replaced by Ansible Jinja2 template rendering on the control node — same merge order, no Python interpreter in init-containers.
  • The chart's init-secrets K8s job is replaced by the local generate-secrets script.
  • Postgres postgres-ess-updater sidecar (which re-runs the init script in case of password changes) is omitted; first-boot init via /docker-entrypoint-initdb.d/ is sufficient for compose, since the generated passwords don't rotate on re-run (idempotent secrets).
  • No Synapse Pro autoscaler (K8s HPA only); replica count is static via ess_synapse_fed_reader_replicas.

Things not yet wired (optional Pro components)

The chart can also deploy these — not included in this role's first pass, add as needed:

  • Hookshot (Matrix bot framework for GitHub/GitLab/JIRA bridges)
  • Secure Border Gateway (Federation app-firewall — only relevant if you federate with strict-control orgs / German TI-Messenger)
  • Advanced Identity Management (LDAP/SCIM provisioning)
  • AuditBot, AdminBot, supervision
  • Sygnal (mobile push gateway)
  • Telemetry service (chart deploys this by default; here it's optional)
  • Content scanner

Each maps to its own template directory in charts/matrix-stack/templates/ and can be added later as additional compose services.