From 2ba0c07cd310d0674924f5abd41512f26bc36424 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Simon=20B=C3=A4rlocher?=
Date: Wed, 27 May 2026 18:08:52 +0200
Subject: [PATCH] docs(reference-ansible): add docs/ tree and document repo, playbooks, Makefile

Addresses the WKS PoC review (Notion 2026-05-26). All docs in English.

- README: purpose, docs table of contents, annotated repo tree
- docs/getting_started.md: prerequisites (WKS account, OIDC, SSH, VPN) + first deploy
- docs/ansible.md: playbook table, "Running Ansible", service parameters, cheatsheet
- docs/secrets.md: canonical Bao login (moved out of README) + demo defaults
- docs/operations.md: full Makefile reference
- docs/inventories.md: repo layout, topology, standard folder structure, walkthrough
- docs/testing.md: static checks, inventory resolution, smoke test / dry run
- remove ARCHITECTURE.md (architecture docs live externally)

Also includes the gymburgdorf inventory build-out (bookstack, homarr,
opnform, send) and scripts/bao-seed.sh. site.yml keeps a third traefik
play (traefik_servers minus the vagrant _dmz/_backend split) so the demo
inventories still configure their reverse proxy after the rebase onto
main.
---
 .gitignore | 4 +
 ARCHITECTURE.md | 453 ------------------
 Makefile | 12 +
 README.md | 132 +++--
 docs/README.md | 29 ++
 docs/ansible.md | 209 ++++++++
 docs/getting_started.md | 93 ++++
 docs/inventories.md | 179 +++++++
 docs/operations.md | 136 ++++++
 docs/secrets.md | 99 ++++
 docs/testing.md | 93 ++++
 .../host_vars/application/authentik.yml | 96 +++-
 .../host_vars/application/bookstack.yml | 43 ++
 .../host_vars/application/drawio.yml | 17 +
 .../host_vars/application/homarr.yml | 71 +++
 .../host_vars/application/nextcloud.yml | 33 +-
 .../host_vars/application/opnform.yml | 60 +++
 .../host_vars/application/send.yml | 8 +
 .../host_vars/application/traefik.yml | 25 +-
 .../host_vars/storage/garage.yml | 12 +-
 .../host_vars/storage/traefik.yml | 8 +
 inventories/demo-gymburgdorf/hosts.yml | 16 +
 playbooks/site.yml | 47 +-
 scripts/bao-seed.sh | 191 ++++++++
 24 files changed, 1541 insertions(+), 525 deletions(-)
 delete mode 100644 ARCHITECTURE.md
 create mode 100644 docs/README.md
 create mode 100644 docs/ansible.md
 create mode 100644 docs/getting_started.md
 create mode 100644 docs/inventories.md
 create mode 100644 docs/operations.md
 create mode 100644 docs/secrets.md
 create mode 100644 docs/testing.md
 create mode 100644 inventories/demo-gymburgdorf/host_vars/application/bookstack.yml
 create mode 100644 inventories/demo-gymburgdorf/host_vars/application/homarr.yml
 create mode 100644 inventories/demo-gymburgdorf/host_vars/application/opnform.yml
 create mode 100644 inventories/demo-gymburgdorf/host_vars/application/send.yml
 create mode 100755 scripts/bao-seed.sh

diff --git a/.gitignore b/.gitignore
index 85cd2b2..e7b512a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,6 +12,10 @@
 .snapshots/*
 # Idea
 /.idea/
+# macOS
+.DS_Store
+# Claude Code local settings
+.claude/
 # Ansible
 /collections/ansible_collections/
 .vagrant/
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
deleted file mode 100644
index a98825c..0000000
--- a/ARCHITECTURE.md
+++ /dev/null
@@ -1,453 +0,0 @@
-
-# 
Architektur — `reference-ansible` - -Dieses Dokument beschreibt die Architektur des Repos `reference-ansible` -und nutzt die Inventory `inventories/demo-gymburgdorf/` als -durchgehendes Beispiel. Es dient sowohl als Onboarding-Doku für neue -Engineers als auch als Referenz beim Aufsetzen weiterer Demo-Mandanten. - -> **Demo-only.** Alle Defaults in den Rollen (Passwörter, Tokens, -> RPC-Secrets) sind unsicher und ausschliesslich für Demo-Setups -> gedacht. Siehe [§ 8 — Security und Demo-Only-Defaults](#8-security-und-demo-only-defaults). - -**Letzte Aktualisierung:** 2026-05-18 · **Owner:** @sbaerlocher - -## Inhalt - -- [§ 0 — Glossar](#0-glossar) -- [§ 1 — Repo-Layout und Roles-Herkunft](#1-repo-layout-und-roles-herkunft) -- [§ 2 — Setup und Voraussetzungen](#2-setup-und-voraussetzungen) -- [§ 3 — Variablen-Hierarchie](#3-variablen-hierarchie) -- [§ 4 — Inventory-Topologie (`demo-gymburgdorf`)](#4-inventory-topologie-demo-gymburgdorf) -- [§ 5 — Service-Layout und Variablen-Verortung](#5-service-layout-und-variablen-verortung) -- [§ 6 — Deploy-Flow](#6-deploy-flow) -- [§ 7 — Traefik-Modi (DMZ vs Backend)](#7-traefik-modi-dmz-vs-backend) -- [§ 8 — Security und Demo-Only-Defaults](#8-security-und-demo-only-defaults) -- [§ 9 — Variablen-Cheatsheet](#9-variablen-cheatsheet) -- [§ 10 — Walkthrough: Neuen Demo-Mandanten anlegen](#10-walkthrough-neuen-demo-mandanten-anlegen) -- [§ 11 — Bekannte Lücken und Trade-offs](#11-bekannte-lücken-und-trade-offs) - -## 0. Glossar - -| Begriff | Bedeutung | -|---|---| -| **OpenBao** | HashiCorp-Vault-Fork. Single Source of Truth für Secrets. Endpoint: `bao.digitalboard.ch`. | -| **Authentik** | Identity Provider. Stellt OIDC für SP-Services und LDAP via Outpost. | -| **Outpost (Authentik)** | Separater Authentik-Sidecar, der LDAP/Proxy-Protokolle für Legacy-Apps emuliert. Spricht via RPC + Token zu Authentik. 
| -| **WOPI** | Web Application Open Platform Interface — Protokoll, mit dem Nextcloud/Opencloud Office-Dokumente an Collabora übergeben. | -| **TSIG / RFC2136** | Authenticated DNS-Updates. Traefik nutzt TSIG-signierte `nsupdate`-Calls für ACME DNS-01-Challenges. | -| **DNS-01 (ACME)** | Let's-Encrypt-Challenge-Typ: Zertifikatsbesitz wird per TXT-Record im DNS bewiesen statt per HTTP. Erforderlich für Wildcard-Certs. | -| **CNAME-Bridge** | `_acme-challenge.` zeigt per CNAME in eine dedizierte Update-Zone (`demo-gymb._acme.digitalboard.ch`). So bleibt der TSIG-Key auf eine schmale Zone beschränkt. | -| **File-Provider / Docker-Provider** | Traefik-Konfigurationsquellen. File-Provider liest statische YAML, Docker-Provider liest Container-Labels via `/var/run/docker.sock`. | -| **STUN/TURN** | NAT-Traversal-Protokolle für WebRTC (z. B. für Nextcloud Talk). Läuft auf separatem Host (`turn`). | -| **Garage** | S3-kompatibler Object Store (Rust). Backend für Nextcloud/Opencloud. | -| **FQCN** | Fully Qualified Collection Name, z. B. `digitalboard.core.traefik`. Ansible-Pflicht ab 2.10. | - -## 1. Repo-Layout und Roles-Herkunft - -```text -reference-ansible/ -├── Makefile # Deploy-Targets, OIDC-Login, OBJC-Fork-Workaround -├── ansible.cfg # collections_path, remote_user=root, hashi_vault auth_method=token -├── requirements.yml # community.hashi_vault + digitalboard.core (Git) -├── playbooks/site.yml # Play-Sequenz (14 Plays, siehe § 6) -├── collections/ # ← installiert von `make install`, gitignored -│ └── ansible_collections/ -│ └── digitalboard/core/ -│ └── roles/ # 🔑 HIER liegen die Rollen, NICHT im Repo-Root -└── inventories/ - ├── demo-gymburgdorf/ # Inventory dieses Dokuments - ├── demo-mbazürich/ - ├── demo-phbern/ - └── vagrant/ # lokale Test-Inventory mit eigener Topologie -``` - -> **Wichtig:** Es gibt **kein** `roles/`-Verzeichnis im Repo-Root. 
Alle -> Rollen kommen aus der Collection `digitalboard.core` (siehe -> [requirements.yml](requirements.yml)), installiert via `make install` -> nach `./collections/`. Plays referenzieren sie per FQCN -> `digitalboard.core.`. - -## 2. Setup und Voraussetzungen - -**Tools auf dem Control-Node:** - -- `ansible` (Core ≥ 2.15) -- `bao` CLI (OpenBao) — z. B. `sudo pacman -S openbao python-hvac` (Arch) oder Homebrew -- `python-hvac` (für `community.hashi_vault` Lookups) -- Auf macOS: `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES` (im [Makefile](Makefile) gesetzt; ohne crashen Ansible-Forks beim Bao-Lookup) - -**Initial-Setup:** - -```bash -git clone -cd reference-ansible -make install # Galaxy + digitalboard.core nach ./collections/ -``` - -**Vor jedem Deploy:** Bao-Login in **derselben Shell**, in der dann `ansible-playbook` läuft: - -```bash -export BAO_ADDR=https://bao.digitalboard.ch -bao login -method=oidc -path=Digitalboard -export VAULT_TOKEN=$(bao print token) -``` - -> ⚠️ `make bao` allein reicht **nicht**: jedes `make`-Target startet -> eine neue Shell, der dort gesetzte `VAULT_TOKEN` lebt nur -> während `make bao` selbst. Entweder die drei Befehle oben manuell -> ausführen oder `make bao deploy_site_demo_gymburgdorf` als **einen** -> Aufruf — sonst hat das Deploy keinen Token. - -**Smoke-Test:** - -```bash -make ping_demo # pingt alle drei Demo-Inventories -``` - -## 3. Variablen-Hierarchie - -Ansible mergt Variablen über mehrere Quellen. Vereinfachtes Modell für -dieses Repo (vollständige Precedence siehe Ansible-Docs): - -```mermaid -flowchart LR - classDef role fill:#fef3c7,stroke:#92400e,color:#000 - classDef group fill:#dbeafe,stroke:#1e40af,color:#000 - classDef host fill:#dcfce7,stroke:#166534,color:#000 - classDef vault fill:#fee2e2,stroke:#991b1b,color:#000 - - R["role defaults
(niedrigste Precedence)
collections/.../roles/<r>/defaults/main.yml"]:::role - GA["group_vars/all/
vault.yml, docker.yml"]:::group - GG["group_vars/<group>/
traefik_servers/, backend_servers/
(parallele Gruppen, gemerged via
ansible_group_priority)"]:::group - HV["host_vars/<host>/
(höchste der drei Inventory-Quellen)"]:::host - BAO["OpenBao
Lookup zur Laufzeit"]:::vault - - R --> |"<wird überschrieben von>"| GA - GA --> |"<wird überschrieben von>"| GG - GG --> |"<wird überschrieben von>"| HV - HV -.community.hashi_vault.-> BAO - GG -.community.hashi_vault.-> BAO -``` - -**Wichtige Eigenschaften:** - -- Mehrere `group_vars//` sind **parallel**, nicht hierarchisch - geschachtelt. `traefik_servers` und `backend_servers` werden nach - `ansible_group_priority` (Default 1) gemerged; bei Konflikt gewinnt - alphabetisch der spätere Gruppenname. -- `host_vars//` schlägt jede Gruppe. -- `host_vars/reverseproxy/traefik.yml: traefik_mode: dmz` überschreibt - daher den Default aus `group_vars/backend_servers/` — und zwar nur, - weil `reverseproxy` nicht Mitglied von `backend_servers` ist (sonst - ginge es sowieso nicht). - -**Bao-Lookups** sind keine Precedence-Ebene, sondern **Werte** innerhalb -einer beliebigen Var-Quelle. Pattern siehe [§ 8](#8-security-und-demo-only-defaults). - -## 4. Inventory-Topologie (`demo-gymburgdorf`) - -```mermaid -flowchart LR - classDef dmz fill:#fee2e2,stroke:#991b1b,color:#000 - classDef app fill:#dcfce7,stroke:#166534,color:#000 - classDef stor fill:#dbeafe,stroke:#1e40af,color:#000 - classDef turn fill:#fef9c3,stroke:#854d0e,color:#000 - - subgraph ALL["group: all_servers"] - direction LR - subgraph DMZ["DMZ 172.16.9.0/24"] - RP["reverseproxy
172.16.9.111
traefik_mode: dmz"]:::dmz - TURN["turn
172.16.9.112
(noch keine Rolle in site.yml)"]:::turn - end - subgraph BE["Backend 172.16.19.0/24
group: backend_servers"] - APP["application
172.16.19.101
traefik_mode: backend
+ authentik, authentik_outpost_ldap,
nextcloud, collabora, drawio"]:::app - ST["storage
172.16.19.102
traefik_mode: backend
+ garage (S3)"]:::stor - end - end - - RP -.HTTPS in, HTTP out.-> APP - RP -.HTTPS in, HTTP out.-> ST -``` - -**Gruppen-Mitgliedschaften (aus [hosts.yml](inventories/demo-gymburgdorf/hosts.yml)):** - -| Gruppe | Mitglieder | Zweck | -|---|---|---| -| `all_servers` | `reverseproxy`, `application`, `storage`, `turn` | Basis-Rolle für alle Hosts | -| `traefik_servers` | `children: all_servers` (= alle 4 Hosts) | Traefik überall; DMZ/Backend per `traefik_mode` | -| `backend_servers` | `application`, `storage` | setzt `traefik_mode: backend` per group_var | -| `garage_servers` | `storage` | Single-Host-Wrapper für Garage-Role | -| `nextcloud_servers`, `collabora_servers`, `drawio_servers`, `authentik_servers`, `authentik_outpost_ldap_servers` | je nur `application` | Single-Host-Wrapper | - -> **Unterschied zur `vagrant`-Inventory:** `vagrant` strukturiert -> Traefik anders — über die Children-Gruppen `traefik_servers_dmz` und -> `traefik_servers_backend` statt über `backend_servers` + -> `host_vars`-Override. Die beiden Topologien sind **strukturell -> inkompatibel**; ein 1:1-Mapping geht nicht. Siehe [§ 10](#10-walkthrough-neuen-demo-mandanten-anlegen) -> für die empfohlene Vorlage. - -## 5. Service-Layout und Variablen-Verortung - -```mermaid -flowchart TB - classDef rp fill:#fee2e2,stroke:#991b1b,color:#000 - classDef ap fill:#dcfce7,stroke:#166534,color:#000 - classDef st fill:#dbeafe,stroke:#1e40af,color:#000 - classDef ext fill:#e9d5ff,stroke:#6b21a8,color:#000 - - Internet((Internet)) - DNS["DNS ns1.digitalboard.ch
RFC2136 TSIG
Zone: demo-gymb._acme.digitalboard.ch
CNAME-Bridge: _acme-challenge.*.gymb.souveredu.ch"]:::ext - BAO["OpenBao
bao.digitalboard.ch
mount: demo-gymburgdorf"]:::ext - - subgraph RP["reverseproxy — traefik dmz"] - TRDMZ["traefik (file provider)
📍 group_vars/traefik_servers/traefik.yml
📍 host_vars/reverseproxy/traefik.yml
→ traefik_mode: dmz
→ traefik_dmz_exposed_services"]:::rp - end - - subgraph APP["application — traefik backend"] - TRA["traefik (docker provider)
📍 group_vars/backend_servers/traefik.yml"]:::ap - AK["authentik (OIDC + LDAP-Outpost-Backend)
📍 host_vars/application/authentik.yml"]:::ap - AKO["authentik_outpost_ldap
📍 host_vars/application/authentik_outpost_ldap.yml"]:::ap - NC["nextcloud
📍 host_vars/application/nextcloud.yml"]:::ap - COL["collabora
📍 host_vars/application/collabora.yml"]:::ap - DRW["drawio
📍 host_vars/application/drawio.yml"]:::ap - end - - subgraph ST["storage — traefik backend"] - TRS["traefik (docker provider)"]:::st - GAR["garage (S3)
📍 host_vars/storage/garage.yml"]:::st - end - - Internet -->|HTTPS :443| TRDMZ - TRDMZ -->|HTTP backend| TRA - TRDMZ -->|HTTP backend| TRS - TRA --> AK & AKO & NC & COL & DRW - TRS --> GAR - - NC -. S3 .-> GAR - NC -. OIDC .-> AK - NC -. WOPI .-> COL - NC -. LDAP .-> AKO - AKO -. RPC + token .-> AK - - TRDMZ -. ACME DNS-01 TSIG .-> DNS - TRDMZ -. hashi_vault acme-tsig .-> BAO - AK -. hashi_vault secrets .-> BAO - NC -. hashi_vault secrets .-> BAO - GAR -. hashi_vault secrets .-> BAO -``` - -> **Hinweis:** `opencloud`, `send` und `openforms` sind in -> [playbooks/site.yml](playbooks/site.yml) als Plays vorhanden, haben -> aber in `demo-gymburgdorf` aktuell keine entsprechende Gruppe in -> [hosts.yml](inventories/demo-gymburgdorf/hosts.yml) — die Plays -> laufen also durch, ohne Targets zu finden. Wenn diese Services für -> einen Mandanten gewünscht sind, jeweils `_servers`-Gruppe in -> `hosts.yml` und `host_vars/application/.yml` ergänzen. -> -> Der Host `turn` ist in `all_servers` und damit auch in -> `traefik_servers`, aber es gibt **keine** Service-Gruppe für ihn — -> aktuell läuft auf `turn` nur die `base`- und `traefik`-Rolle. - -## 6. Deploy-Flow - -Reihenfolge aus [playbooks/site.yml](playbooks/site.yml): - -```mermaid -sequenceDiagram - participant U as User - participant A as ansible-playbook - participant V as OpenBao - participant H as Hosts - - U->>U: bao login + export VAULT_TOKEN - U->>A: make deploy_site_demo_gymburgdorf - A->>A: lade vars: role defaults → group_vars/all → group_vars/<groups> → host_vars/<host> - A->>V: community.hashi_vault Lookups
(acme-tsig, service-secrets) - V-->>A: secret values - A->>H: Play 1 — base (alle Hosts) - A->>H: Play 2 — traefik (alle Hosts: dmz auf reverseproxy, backend sonst) - A->>H: Play 3 — httpbin - A->>H: Play 4 — 389ds - A->>H: Play 5 — keycloak - A->>H: Play 6 — garage (storage) - A->>H: Play 7 — collabora (application) - A->>H: Play 8 — authentik (application) - A->>H: Play 9 — authentik_outpost_ldap (application) - A->>H: Play 10 — nextcloud (application) - A->>H: Play 11 — drawio (application) - A->>H: Play 12 — send - A->>H: Play 13 — openforms - A->>H: Play 14 — opencloud -``` - -Plays ohne passende Gruppen-Mitglieder (`httpbin_servers`, `ds389_servers`, -`keycloak_servers`, `send_servers`, `openforms_servers`, -`opencloud_servers` in dieser Inventory) laufen no-op durch. - -`--diff` ist im Target gesetzt → Änderungen pro Task sichtbar. - -## 7. Traefik-Modi (DMZ vs Backend) - -**`traefik_mode: dmz`** — public-facing Reverse Proxy auf `reverseproxy`: - -- **file provider** mit `services.yml` für statisches Routing. -- Kein Docker-Socket gemountet, keine lokalen Container. -- Routet zu `backend_host`-Adressen anderer Maschinen. -- Backends werden über `traefik_dmz_exposed_services` (Liste in - `host_vars/reverseproxy/`) deklariert. Selektive Backend-Auswahl - zusätzlich über `traefik_backend_servers_to_proxy`. - -**`traefik_mode: backend`** — application/storage: - -- Mountet `/var/run/docker.sock`. -- **docker provider**: Auto-Discovery via Container-Labels (`traefik.enable=true`). -- Services werden lokal exponiert; die DMZ-Traefik routet von aussen - dorthin (Klartext-HTTP, siehe [§ 8](#8-security-und-demo-only-defaults)). - -**Beide Modi** unterstützen ACME via RFC2136 DNS Challenge oder -Self-Signed (`traefik_cert_mode: acme | selfsigned`). - -## 8. Security und Demo-Only-Defaults - -> Dieses Repo ist explizit für **Demo-Setups** gedacht. 
Alle Default- -> Werte in den Rollen sind unsicher und werden in `demo-*`-Inventories -> über Bao-Lookups oder host_vars überschrieben. Für Prod-Deployments -> gilt zusätzlich der Härtungs-Block weiter unten. - -### Secret-Pattern (Bao-Lookup) - -```yaml -# group_vars/.../.yml oder host_vars/.../.yml -authentik_secret_key: "{{ lookup('community.hashi_vault.hashi_vault', - vault_mount + '/data/authentik:secret_key', - url=vault_addr) }}" -``` - -- `vault_mount` und `vault_addr` aus - [group_vars/all/vault.yml](inventories/demo-gymburgdorf/group_vars/all/vault.yml). -- KV-v2-Pfade brauchen explizit `/data/` im Pfad — Ansible löst das - nicht selber auf. -- `vault_mount` ist pro Inventory eindeutig (`demo-gymburgdorf`, - `demo-phbern`, …) → Mandant-Isolation in Bao via Mount + Policy. - -### Demo-Only-Defaults — Override-Pflicht - -Diese Defaults in `digitalboard.core` sind unsicher. In jeder -**Prod-tauglichen** Deployment müssen sie via Bao-Lookup oder -host_var überschrieben werden: - -| Variable | Default | Wo überschreiben | -|---|---|---| -| `keycloak_admin_password` | `changeme` | host_vars `keycloak_servers` | -| `keycloak_postgres_password` | `changeme` | dito | -| `authentik_secret_key` | `changeme-generate-a-random-string` | `host_vars/application/authentik.yml` | -| `authentik_postgres_password` | `changeme` | dito | -| `nextcloud_admin_password` | `admin` | `host_vars/application/nextcloud.yml` | -| `nextcloud_postgres_password` | `changeme` | dito | -| `nextcloud_s3_key` / `nextcloud_s3_secret` | `changeme` / `changeme` | dito | -| `garage_webui_password` | `admin` | `host_vars/storage/garage.yml` | -| `garage_rpc_secret` | `0123…cdef` (64-hex Konstante) | dito | -| `garage_admin_token` | identisch zu `rpc_secret` | dito | -| `garage_metrics_token` | identisch zu `rpc_secret` | dito | - -> **Konvention:** Jeder Wert, der oben in der Tabelle steht, **muss** -> in `demo-*/host_vars/.../...yml` einen Bao-Lookup haben, bevor das -> Inventory als deploy-fähig 
gilt. - -### Threat-Boundaries (Stand: Demo) - -| Boundary | Status | Notiz | -|---|---|---| -| DMZ ↔ Backend (172.16.9 ↔ 172.16.19) | **Klartext-HTTP** | Auth-Bearer, OIDC-Code, Session-Cookies reisen unverschlüsselt. Für Demo ok, für Prod: mTLS oder WireGuard-Overlay. | -| Host-Firewall | **fehlt** | Die `base`-Rolle installiert kein UFW/nftables. Segmentation hängt am Hypervisor/VLAN. | -| SSH | `ansible_user: root` | Kein Bastion, kein Jumphost. Key-Distribution out-of-band. | -| Authentik-SPOF | **akzeptiert** | IDP und SP-Services auf demselben Host (`application`). Authentik-Ausfall = Login-Ausfall inkl. LDAP-Outpost. Kein Break-Glass-Pfad. | -| ACME-TSIG-Key | Bao-Lookup | Eine TSIG-Key pro Demo-Zone (`acme_update_key_demo_gymb`), Zonen-isoliert. Rotation manuell. | -| Backup/DR | **out-of-scope** | Garage `replication_factor: 1` (default), kein Postgres-Backup-Job, kein Bao-Snapshot-Cron. | - -### Für Prod-Adaption ergänzen - -- Host-FW (`base`-Rolle erweitern oder eigene `firewall`-Rolle). -- mTLS oder WireGuard zwischen DMZ und Backend. -- Authentik auf separaten Host, mit Recovery-Admin-Token. -- Bao-Policies pro Inventory-Mount (read-only für Deploy-Token, - write-only für Bootstrap-Job). -- Backup-Cron für Postgres + Garage + Bao. -- SSH-Bastion + Key-Rotation. - -## 9. 
Variablen-Cheatsheet - -| Variable | Wohin in `demo-gymburgdorf/` | Warum | -|---|---|---| -| `vault_addr`, `vault_mount` | `group_vars/all/vault.yml` | Bao-Endpoint gilt site-weit | -| `docker_registry_mirrors` | `group_vars/all/docker.yml` | Pulls aus Mirror auf allen Hosts | -| `traefik_acme_*`, `traefik_use_ssl`, `traefik_cert_mode` | `group_vars/traefik_servers/traefik.yml` | Gilt für alle Traefik-Instanzen (dmz + backend) | -| `traefik_mode: backend` | `group_vars/backend_servers/traefik.yml` | Default für app + storage | -| `traefik_mode: dmz` | `host_vars/reverseproxy/traefik.yml` | Host-spezifischer Override | -| `traefik_dmz_exposed_services` | `host_vars/reverseproxy/` | Liste der DMZ-Backends — nur dort sinnvoll | -| `nextcloud_*`, `authentik_*`, `collabora_*`, `drawio_*` | `host_vars/application/.yml` | Service läuft auf `application` | -| `garage_*` | `host_vars/storage/garage.yml` | Service läuft auf `storage` | -| Secrets (Passwords, Tokens, Keys) | inline-Variable mit `lookup('community.hashi_vault.hashi_vault', …)` | Single source of truth via Bao | - -## 10. Walkthrough: Neuen Demo-Mandanten anlegen - -Empfohlene Vorlage: **`demo-gymburgdorf`** (nicht `vagrant`, weil die -Gruppen-Topologie inkompatibel ist). - -1. **Inventory kopieren:** - - ```bash - cp -r inventories/demo-gymburgdorf inventories/demo- - ``` - -2. **`hosts.yml` anpassen:** IPs, Hostnames pro Host. - -3. **`group_vars/all/vault.yml`** — `vault_mount` auf den neuen - Mandant-Mount setzen (`demo-`). - -4. **`group_vars/traefik_servers/traefik.yml`** — `traefik_acme_dns_zone` - und `traefik_acme_tsig_*`-Lookup-Pfade auf die neue Zone / - den neuen Bao-Pfad biegen. - -5. **`host_vars/application/*.yml`** und **`host_vars/storage/*.yml`** - durchgehen: FQDNs auf das neue Domain-Pattern (z. B. - `*..souveredu.ch`), Bao-Lookup-Pfade auf `demo-/data/…`. - -6. **OpenBao vorbereiten** (out-of-band, nicht via Ansible): - - Neuen KV-v2-Mount `demo-` anlegen. 
- - Secrets schreiben: `acme-tsig`, `authentik`, `nextcloud`, - `garage`, … (siehe [§ 8](#8-security-und-demo-only-defaults) - für die Override-Pflicht-Liste). - - Policy für den Deploy-Token: read auf `demo-/data/*`. - -7. **DNS:** TSIG-Update-Zone (`demo-._acme.digitalboard.ch`) bei - `ns1.digitalboard.ch` anlegen, CNAMEs `_acme-challenge.*..` - dorthin. - -8. **Makefile** — neues Target nach Vorbild von - `deploy_site_demo_gymburgdorf` ergänzen und in `deploy_site_demo` - einreihen. - -9. **Smoke-Test:** `ansible all -i inventories/demo-/hosts.yml -m ping`. - -10. **Deploy:** Bao-Login + `make deploy_site_demo_`. - -## 11. Bekannte Lücken und Trade-offs - -- **`opencloud`-Inventory in `demo-gymburgdorf`:** keine - `opencloud_servers`-Gruppe vorhanden. Wenn benötigt, ergänzen wie in - [§ 5](#5-service-layout-und-variablen-verortung) beschrieben. -- **`turn`-Host:** in DMZ definiert, aber keine STUN/TURN-Rolle in - [playbooks/site.yml](playbooks/site.yml). Wird aktuell nur per - `base`+`traefik` provisioniert. -- **Idempotenz:** Rollen sind Docker-Compose-basiert; Re-Runs führen - ggf. Container-Restarts aus, wenn sich Compose-Inputs ändern. Kein - spezieller Rollback-Mechanismus — bei Fehlschlag manuell auf den - vorigen Stand zurücksetzen. -- **TLS-Renewal:** Erfolgt durch Traefik intern via ACME. Kein - externer Renew-Cron im Repo. -- **CI/Testing:** Aktuell nicht im Repo. Smoke-Test via `make ping_demo`. -- **Logs:** Traefik läuft in `demo-gymburgdorf` und `vagrant` mit - `traefik_log_level: DEBUG` (Role-Default ist `INFO`) — vor - Prod-Adaption auf `INFO` oder `WARN` reduzieren. diff --git a/Makefile b/Makefile index ffa5afe..687a27f 100644 --- a/Makefile +++ b/Makefile @@ -10,6 +10,18 @@ bao: bao login -method=oidc -path=Digitalboard role=default $(eval export VAULT_TOKEN=$(shell bao print token)) +# Seed/merge OpenBao secrets for a demo inventory. Idempotent: existing +# keys are kept; only missing keys are generated. 
Pass DRY_RUN=1 to +# preview without writing. +seed_bao_gymburgdorf: + scripts/bao-seed.sh demo-gymburgdorf + +seed_bao_mbazürich: + scripts/bao-seed.sh demo-mbazürich + +seed_bao_phbern: + scripts/bao-seed.sh demo-phbern + ping_demo: echo "# pinging demo-gymburgdorf" ansible all -i inventories/demo-gymburgdorf/hosts.yml -m ping || true diff --git a/README.md b/README.md index fead38b..ee1a193 100644 --- a/README.md +++ b/README.md @@ -1,61 +1,117 @@ + # reference-ansible -Ansible-Setup für Demo-Deployments (`demo-gymburgdorf`, -`demo-mbazürich`, `demo-phbern`) und lokale Vagrant-Tests. Rollen -kommen aus der Collection +## Purpose + +This repository is the **Ansible setup for the Digitalboard demo +deployments** — reproducible reference tenants on which the full stack +(identity, storage, office, forms, dashboards, …) can be set up and +demonstrated end-to-end. + +It contains **no** roles of its own: all logic comes from the collection [`digitalboard.core`](https://git.digitalboard.ch/Digitalboard/digitalboard.core) -(via [requirements.yml](requirements.yml)). +(via [requirements.yml](requirements.yml)). What lives here is the +**inventory and configuration layer** — one dedicated inventory per +tenant under `inventories/demo-*` that wires up the roles, defines hosts +and domains, and pulls secrets from [OpenBao](https://openbao.org/). The +only playbook is [playbooks/site.yml](playbooks/site.yml). -> Architektur, Variablen-Hierarchie, Service-Topologie und der -> Walkthrough zum Aufsetzen neuer Mandanten: siehe -> **[ARCHITECTURE.md](ARCHITECTURE.md)**. +The repository thus serves two purposes: -## Voraussetzungen +- **Demo operation:** deploy and keep up to date the existing demo + tenants (`demo-gymburgdorf`, `demo-mbazürich`, `demo-phbern`). +- **Template:** create new tenants following the `demo-gymburgdorf` + pattern — see [docs/inventories.md § Walkthrough](docs/inventories.md#walkthrough-creating-a-new-demo-tenant). 
-- `ansible` (Core ≥ 2.15) -- `bao` CLI ([OpenBao](https://openbao.org/)) — z. B. - `sudo pacman -S openbao python-hvac` (Arch) oder Homebrew -- `python-hvac` (für `community.hashi_vault` Lookups) +> **Demo-only.** All role defaults (passwords, tokens, RPC secrets) are +> insecure and intended exclusively for demo setups. For production +> adaptation, see [docs/secrets.md](docs/secrets.md). -## Setup +## Documentation -```bash -make install # installiert digitalboard.core + community.hashi_vault nach ./collections/ +In-depth documentation lives in the [`docs/`](docs/) folder — start with +the index [docs/README.md](docs/README.md): + +| Document | Content | +| --- | --- | +| [docs/getting_started.md](docs/getting_started.md) | Prerequisites (access, tools), first deploy step by step | +| [docs/operations.md](docs/operations.md) | Setup, prerequisites, deploy flow, make targets, smoke test, known gaps | +| [docs/secrets.md](docs/secrets.md) | OpenBao login, secret lookup pattern, demo-only defaults, threat boundaries | +| [docs/inventories.md](docs/inventories.md) | Repository layout, roles origin, inventory topology, new-tenant walkthrough | +| [docs/ansible.md](docs/ansible.md) | Playbooks (`site.yml`), service parameters, variable cheat sheet | +| [docs/testing.md](docs/testing.md) | Static checks, inventory resolution, smoke test/dry run before the deploy | + +## Repository structure + +```text +reference-ansible/ +├── Makefile # deploy targets, OIDC login, OBJC fork workaround +├── ansible.cfg # collections_path, remote_user=root, hashi_vault auth +├── requirements.yml # community.hashi_vault + digitalboard.core (Git) +├── Vagrantfile # local test VMs +├── playbooks/ +│ └── site.yml # the only playbook — play sequence of all services +├── scripts/ +│ └── bao-seed.sh # seed/merge OpenBao secrets per inventory (idempotent) +├── docs/ # in-depth documentation (see table above) +├── collections/ # ← installed by `make install`, gitignored +│ └── 
ansible_collections/digitalboard/core/roles/ # 🔑 the roles live here +└── inventories/ + ├── demo-gymburgdorf/ # reference tenant — template for new tenants + │ ├── hosts.yml # hosts + group topology + │ ├── group_vars/ # all/ · traefik_servers/ · backend_servers/ + │ └── host_vars/ # reverseproxy/ · application/ · storage/ + ├── demo-mbazürich/ # demo tenant + ├── demo-phbern/ # demo tenant + └── vagrant/ # local test inventory (incompatible topology) ``` -## Secrets (OpenBao) +> **No `roles/` in the repository root** — all roles come from +> `digitalboard.core` and are installed via `make install` into +> `./collections/`. Plays reference them by FQCN +> `digitalboard.core.`. Details: +> [docs/inventories.md](docs/inventories.md#repo-layout-and-role-origin). -Vor jedem Deploy in **derselben Shell** authentisieren: +## Quick Start + +> **First time here?** Prerequisites (WKS account, OIDC access, SSH key, +> VPN), tool setup, and the first deploy step by step: +> **[docs/getting_started.md](docs/getting_started.md)**. ```bash +make install # collections into ./collections/ + +# Log in to OpenBao — in the SAME shell as the deploy. +# Full login flow + make-bao caveat: docs/secrets.md export BAO_ADDR=https://bao.digitalboard.ch bao login -method=oidc -path=Digitalboard export VAULT_TOKEN=$(bao print token) + +make ping_demo # smoke test against all demo inventories +make deploy_site_demo_gymburgdorf # single demo site +make deploy_site_demo # all three demo sites ``` -> ⚠️ `make bao` allein reicht **nicht** — jedes `make`-Target läuft in -> einer neuen Shell, der dort gesetzte `VAULT_TOKEN` lebt nur während -> `make bao` selbst. Entweder die drei Befehle oben manuell im Shell -> ausführen oder `make bao deploy_site_demo_gymburgdorf` als **einen** -> Aufruf chainen. +Login details and the `make bao` caveat: [docs/secrets.md](docs/secrets.md#openbao-login). +Prerequisites, all make targets, and the deploy flow: +[docs/operations.md](docs/operations.md). 
Invoking Ansible directly +(`--limit`, `--check`): [docs/ansible.md § Running Ansible](docs/ansible.md#running-ansible). -## Deploy +## Available playbooks -```bash -make ping_demo # Smoke-Test gegen alle Demo-Inventories -make deploy_site_demo_gymburgdorf # einzelnes Demo-Site -make deploy_site_demo # alle drei Demo-Sites -``` - -Auf macOS setzt das [Makefile](Makefile) zusätzlich -`OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES` — ohne diese Env-Var -crashen Ansible-Forks beim ersten `community.hashi_vault`-Lookup. +The only playbook is [playbooks/site.yml](playbooks/site.yml) — a +sequence of plays, each applying one `digitalboard.core` role to a host +group (base, traefik, garage, authentik, authentik_outpost_ldap, +nextcloud, collabora, drawio, send, opnform, homarr, bookstack, …). +Which plays take effect in an inventory is governed solely by group +membership in `hosts.yml`. Full play table and all service parameters: +**[docs/ansible.md](docs/ansible.md)**. ## Inventories -| Inventory | Zweck | -|---|---| -| [`inventories/demo-gymburgdorf/`](inventories/demo-gymburgdorf/) | Demo-Mandant — als Vorlage für neue Mandanten empfohlen, siehe [ARCHITECTURE.md § 10](ARCHITECTURE.md#10-walkthrough-neuen-demo-mandanten-anlegen) | -| [`inventories/demo-mbazürich/`](inventories/demo-mbazürich/) | Demo-Mandant | -| [`inventories/demo-phbern/`](inventories/demo-phbern/) | Demo-Mandant | -| [`inventories/vagrant/`](inventories/vagrant/) | lokale Test-VMs; **inkompatible Gruppen-Topologie** zu den Demo-Inventories | +| Inventory | Purpose | +| --- | --- | +| [`inventories/demo-gymburgdorf/`](inventories/demo-gymburgdorf/) | Demo tenant — recommended as a template for new tenants, see [docs/inventories.md](docs/inventories.md#walkthrough-creating-a-new-demo-tenant) | +| [`inventories/demo-mbazürich/`](inventories/demo-mbazürich/) | Demo tenant | +| [`inventories/demo-phbern/`](inventories/demo-phbern/) | Demo tenant | +| [`inventories/vagrant/`](inventories/vagrant/) | Local test 
VMs; **incompatible group topology** compared to the demo inventories | diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 0000000..4071672 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,29 @@ + +# Documentation — `reference-ansible` + +Entry point for this repository's in-depth documentation. The +[`demo-gymburgdorf`](../inventories/demo-gymburgdorf/) inventory serves +as a running example throughout. + +> **Demo-only.** All role defaults (passwords, tokens, RPC secrets) are +> insecure and intended exclusively for demo setups. See +> [secrets.md § Demo-Only-Defaults](secrets.md#demo-only-defaults--must-be-overridden). + +## Table of contents + +| Document | Content | +| --- | --- | +| [getting_started.md](getting_started.md) | Prerequisites (access, tools), first deploy step by step | +| [operations.md](operations.md) | Setup, prerequisites, deploy flow, smoke test, known gaps | +| [secrets.md](secrets.md) | OpenBao login, secret lookup pattern, demo-only defaults, threat boundaries | +| [inventories.md](inventories.md) | Repository layout, roles origin, inventory topology, new-tenant walkthrough | +| [ansible.md](ansible.md) | Playbooks (`site.yml`), per-service parameters, variable cheat sheet | +| [testing.md](testing.md) | Static checks, inventory resolution, smoke test/dry run before the deploy | + +## Quick links + +- **First time here?** → [getting_started.md](getting_started.md) +- **Create a new tenant** → [inventories.md § Walkthrough](inventories.md#walkthrough-creating-a-new-demo-tenant) +- **Which variable goes where?** → [ansible.md § Variable cheat sheet](ansible.md#variable-cheatsheet) +- **Store a secret in Bao** → [secrets.md § Secret pattern](secrets.md#secret-pattern-bao-lookup) +- **Run a deploy** → [operations.md § Deploy](operations.md#deploy) diff --git a/docs/ansible.md b/docs/ansible.md new file mode 100644 index 0000000..2f663fe --- /dev/null +++ b/docs/ansible.md @@ -0,0 +1,209 @@ + +# Playbooks & Parameters + +[← 
Documentation index](README.md) + +Central reference: which plays [playbooks/site.yml](../playbooks/site.yml) +runs, which service parameters are relevant per role, and where they are +located in the inventory. Example used throughout: `demo-gymburgdorf`. + +## Playbook `site.yml` + +The only playbook is [playbooks/site.yml](../playbooks/site.yml). It +consists of a sequence of plays, each applying one role from +`digitalboard.core` to a host group. All plays run with +`become: yes`. Plays whose group has no members in an inventory run as a +**no-op**. + +| # | Play / role | `hosts:` | Target in `demo-gymburgdorf`? | +| --- | --- | --- | --- | +| 1 | `digitalboard.core.base` | `all_servers` | ✅ all 4 hosts | +| 2 | `digitalboard.core.traefik` | `traefik_servers_backend` | — no-op (vagrant-only group) | +| 3 | `digitalboard.core.traefik` | `traefik_servers_dmz` | — no-op (vagrant-only group) | +| 4 | `digitalboard.core.traefik` | `traefik_servers:!traefik_servers_dmz:!traefik_servers_backend` | ✅ all 4 (dmz on `reverseproxy`, otherwise backend) | +| 5 | `digitalboard.core.httpbin` | `httpbin_servers` | — no-op | +| 6 | `digitalboard.core.389ds` | `ds389_servers` | — no-op | +| 7 | `digitalboard.core.keycloak` | `keycloak_servers` | — no-op | +| 8 | `digitalboard.core.garage` | `garage_servers` | ✅ `storage` | +| 9 | `digitalboard.core.collabora` | `collabora_servers` | ✅ `application` | +| 10 | `digitalboard.core.authentik` | `authentik_servers` | ✅ `application` | +| 11 | `digitalboard.core.authentik_outpost_ldap` | `authentik_outpost_ldap_servers` | ✅ `application` | +| 12 | `digitalboard.core.nextcloud` | `nextcloud_servers` | ✅ `application` | +| 13 | `digitalboard.core.drawio` | `drawio_servers` | ✅ `application` | +| 14 | `digitalboard.core.send` | `send_servers` | ✅ `application` | +| 15 | `digitalboard.core.opnform` | `opnform_servers` | ✅ `application` | +| 16 | `digitalboard.core.homarr` | `homarr_servers` | ✅ `application` | +| 17 | 
`digitalboard.core.bookstack` | `bookstack_servers` | ✅ `application` | +| 18 | `digitalboard.core.opencloud` | `opencloud_servers` | — no-op (no group) | + +> **Three traefik plays, two topologies.** `vagrant` splits the reverse +> proxy into `traefik_servers_dmz` + `traefik_servers_backend` (plays 2 +> and 3). The demo inventories (e.g. `demo-gymburgdorf`) instead group +> all hosts under `traefik_servers` and select dmz/backend per host via +> `traefik_mode`; play 4's `:!…` exclusion pattern targets exactly those +> hosts and stays a no-op for the vagrant split. Each topology thus +> triggers only the traefik play(s) that fit it — no host runs traefik +> twice. +> +> Which plays take effect for a tenant is controlled **solely through +> group membership** in `hosts.yml`. A service becomes active as soon as +> its `_servers` group contains a host and a matching +> `host_vars//.yml` exists. + +## Running Ansible + +**Prerequisite:** collections installed (`make install`) and logged in +to OpenBao in the **same shell** (`VAULT_TOKEN` set) — without a token, +the `community.hashi_vault` lookups fail. Login procedure: +[secrets.md § OpenBao login](secrets.md#openbao-login). +Initial setup step by step: [getting_started.md](getting_started.md). + +### Via Makefile (recommended) + +```bash +make ping_demo # Smoke test (ping) against all demo inventories +make deploy_site_demo_gymburgdorf # single demo site +make deploy_site_demo # all three demo sites in sequence +``` + +The Make targets encapsulate the full `ansible-playbook` invocation +including the macOS fork env var; only the Gymburgdorf target also sets `--diff`. All targets: +[operations.md § Makefile reference](operations.md#makefile-reference). + +### Direct `ansible-playbook` invocation + +When you need flags that the targets do not set: + +```bash +# Full deploy of an inventory +ansible-playbook playbooks/site.yml \ + -i inventories/demo-gymburgdorf/hosts.yml --diff + +# Only one host (e.g.
just the application machine) +ansible-playbook playbooks/site.yml \ + -i inventories/demo-gymburgdorf/hosts.yml --limit application + +# Dry run without changes +ansible-playbook playbooks/site.yml \ + -i inventories/demo-gymburgdorf/hosts.yml --check --diff +``` + +> Because service selection runs through the groups in `hosts.yml` (not +> through tags), `--limit ` is the usual way to narrow down a +> deploy. `--check` is only of limited value with the Docker Compose-based +> roles — some tasks report "changed" because they only learn the actual +> container state at runtime. + +Deploy flow and play order: [operations.md § Deploy](operations.md#deploy). + +## Where parameters belong + +| Variable group | File in `demo-gymburgdorf/` | Why | +| --- | --- | --- | +| `vault_addr`, `vault_mount` | `group_vars/all/vault.yml` | Bao endpoint applies site-wide | +| `docker_registry_mirrors` | `group_vars/all/docker.yml` | Pulls from mirror on all hosts | +| `traefik_acme_*`, `traefik_use_ssl`, `traefik_cert_mode`, `traefik_log_level` | `group_vars/traefik_servers/traefik.yml` | applies to all Traefik instances (dmz + backend) | +| `traefik_mode: backend` | `group_vars/backend_servers/traefik.yml` | default for app + storage | +| `traefik_mode: dmz`, `traefik_dmz_exposed_services` | `host_vars/reverseproxy/traefik.yml` | host-specific override, only meaningful there | +| `nextcloud_*`, `authentik_*`, `collabora_*`, `drawio_*`, `send_*`, `opnform_*`, `homarr_*`, `bookstack_*` | `host_vars/application/.yml` | service runs on `application` | +| `garage_*` | `host_vars/storage/garage.yml` | service runs on `storage` | +| Secrets (passwords, tokens, keys) | inline var with `lookup('community.hashi_vault.hashi_vault', …)` | single source of truth via Bao, see [secrets.md](secrets.md) | + +## Service parameters in detail + +Complete variable lists are in the `defaults/main.yml` of the respective +role in `digitalboard.core`. 
Below are the parameters maintained in the +demo inventories per service — as guidance on which fields a new tenant +typically needs to set. + +### traefik + +| Variable | Example / default | Purpose | +| --- | --- | --- | +| `traefik_mode` | `dmz` \| `backend` | Provider mode: `dmz` = file provider (public-facing, no Docker socket), `backend` = docker provider (auto-discovery via container labels) | +| `traefik_cert_mode` | `acme` \| `selfsigned` | Certificate source | +| `traefik_use_ssl` | `true` | TLS active | +| `traefik_ssl_email` | `hostmaster@digitalboard.ch` | ACME contact | +| `traefik_log_level` | `DEBUG` (role default `INFO`) | reduce for prod | +| `traefik_network` | `proxy` | Docker network for backend mode | +| `traefik_acme_dns_zone` | `demo-gymb._acme.digitalboard.ch` | RFC2136 update zone | +| `traefik_acme_dns_nameserver` | from Bao / `172.16.9.169` (DMZ override) | TSIG update target | +| `traefik_acme_tsig_algorithm` / `_key` / `_secret` | `hmac-sha256` / Bao | TSIG signature | +| `traefik_acme_tcp_only` | `true` | force DNS lookups over TCP/53 | +| `traefik_acme_disable_ans_checks` | `true` (DMZ only) | skip NS propagation poll | +| `traefik_dmz_exposed_services` | list (DMZ) | which backends the DMZ Traefik routes | + +### authentik (IdP — OIDC + LDAP outpost backend) + +| Variable | Purpose | +| --- | --- | +| `authentik_domains` | public FQDNs (`auth.gymb.souveredu.ch`) | +| `authentik_host_rewrite_domains` | internal `*.int.*` names for LAN server-to-server | +| `authentik_secret_key`, `authentik_postgres_password` | Bao lookup | +| `authentik_ldap_apps`, `authentik_ldap_outpost` | LDAP app + outpost definition (base_dn, token) | +| `authentik_proxy_apps` | ForwardAuth apps (slug, external/internal_host, allowed_groups) | + +### authentik_outpost_ldap + +| Variable | Purpose | +| --- | --- | +| `authentik_outpost_ldap_host` | internal Authentik host (`https://auth.int.…`) | +| `authentik_outpost_ldap_token` | outpost token (Bao, identical 
to `authentik.ldap_outpost_token`) | + +### nextcloud + +| Variable | Purpose | +| --- | --- | +| `nextcloud_image` | image tag (pin to patched version) | +| `nextcloud_domains` | first entry = canonical public FQDN, further `*.int.*` | +| `nextcloud_admin_user` / `_password`, `nextcloud_postgres_password` | admin + DB (Bao) | +| `nextcloud_use_s3_storage`, `nextcloud_s3_*` | S3 primary storage via Garage (key/secret via `garage_credentials` lookup) | +| `nextcloud_enable_collabora`, `nextcloud_collabora_domain` / `_public_domain` | WOPI integration | +| `nextcloud_enable_drawio`, `nextcloud_drawio_url` | Draw.io integration | +| `nextcloud_oidc_providers` | OIDC login via Authentik (discovery_url, client_id/secret) | +| `nextcloud_ldap_enabled`, `nextcloud_ldap_config` | LDAP backend against Authentik outpost | +| `nextcloud_apps_to_install` | app list (groupfolders, richdocuments, spreed, user_ldap, …) | +| `nextcloud_allow_local_remote_servers`, `nextcloud_extra_hosts` | LAN-only routing for server-to-server calls | + +### collabora + +| Variable | Purpose | +| --- | --- | +| `collabora_domains` | public + internal FQDN | +| `collabora_allowed_domains`, `collabora_frame_ancestors` | allowed WOPI hosts / iframe embedding | + +### drawio + +| Variable | Purpose | +| --- | --- | +| `drawio_domain`, `drawio_extra_domains` | public + internal FQDN | +| `drawio_authentik_forward_auth`, `_url` | access protection via Authentik ForwardAuth | + +### garage (S3 object store) + +| Variable | Purpose | +| --- | --- | +| `garage_s3_domains` | first entry = public S3 FQDN, further `*.int.*` | +| `garage_webui_domain`, `garage_webui_enabled` | admin WebUI | +| `garage_webui_authentik_forward_auth`, `_url` | WebUI behind Authentik (admins only) | +| `garage_rpc_secret`, `garage_admin_token`, `garage_metrics_token` | Bao lookup | +| `garage_bootstrap_*` | single-node cluster bootstrap (zone, capacity) | +| `garage_s3_keys` | keys + buckets + permissions (e.g. 
`nextcloud`) | + +### send / opnform / homarr / bookstack + +Same pattern: `_domain`/`_domains` (+ `*.int.*`), +`_base_url`, admin credentials and app keys via Bao lookup, +plus OIDC integration with Authentik (`_oidc_*`: issuer, client_id, +client_secret, admin_group). For the concrete fields, see the respective +`host_vars/application/.yml`. + +## Variable cheatsheet + +Short form of the location table above — "which variable goes where": + +- **Site-wide** → `group_vars/all/` (Bao endpoint, Docker mirror) +- **All Traefik** → `group_vars/traefik_servers/` +- **app + storage** → `group_vars/backend_servers/` +- **Single host** → `host_vars//.yml` +- **Secrets** → always Bao lookup, never plaintext (see [secrets.md](secrets.md)) diff --git a/docs/getting_started.md b/docs/getting_started.md new file mode 100644 index 0000000..6c06376 --- /dev/null +++ b/docs/getting_started.md @@ -0,0 +1,93 @@ + +# Getting Started + +[← Documentation index](README.md) + +From zero to your first deploy. This page walks through prerequisites, +setup, and the first Ansible run. Deeper details are linked along the way. + +## Prerequisites + +### Access (to be set up out-of-band) + +- **WKS account with OIDC access to OpenBao.** The login runs via + `bao login -method=oidc -path=Digitalboard`. Without an authorized + account, authentication fails — and without a token there is no + secret lookup, hence no deploy. +- **Bao policy / mount read.** The account needs **Read** on the + mount of the target inventory (e.g. `demo-gymburgdorf/data/*`). Which + paths an inventory reads is documented in the `host_vars/.../.yml` + (see [secrets.md § Secret pattern](secrets.md#secret-pattern-bao-lookup)). +- **SSH key on the target hosts.** The hosts are provisioned as `root` + (`ansible_user: root`, no bastion/jump host). Your own public key + must be placed out-of-band as `root` on the hosts. 
+- **Network access (VPN).** `bao.digitalboard.ch` and the host networks + (`172.16.9.0/24` DMZ, `172.16.19.0/24` backend) are not publicly + reachable — access requires VPN/network access into the Digitalboard network. + +### Tools on the control node + +- `ansible` (Core ≥ 2.15) — `ansible --version` to check +- `bao` CLI ([OpenBao](https://openbao.org/)) — e.g. + `sudo pacman -S openbao python-hvac` (Arch) or via Homebrew +- `python-hvac` (for `community.hashi_vault` lookups) +- On macOS: `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES` (set in the + [Makefile](../Makefile)) — without this env var, Ansible forks crash + on the first Bao lookup, because the Objective-C runtime is not + fork-safe. + +## 1. Clone the repo and install collections + +```bash +git clone https://git.digitalboard.ch/Digitalboard/reference-ansible +cd reference-ansible +make install # community.hashi_vault + digitalboard.core into ./collections/ +``` + +There is **no** `roles/` directory in the repo — all roles come from +the `digitalboard.core` collection. See +[inventories.md § Repo layout](inventories.md#repo-layout-and-role-origin). + +## 2. Log in to OpenBao + +The login must happen in the **same shell** in which +`ansible-playbook` then runs — details and the `make bao` caveat in +[secrets.md § OpenBao login](secrets.md#openbao-login): + +```bash +export BAO_ADDR=https://bao.digitalboard.ch +bao login -method=oidc -path=Digitalboard +export VAULT_TOKEN=$(bao print token) +``` + +## 3. Check connectivity (smoke test) + +```bash +make ping_demo # ping module against all three demo inventories +``` + +If a host does not respond, it is usually due to the SSH key +(prerequisite above) or missing network access (VPN). + +## 4. 
Run Ansible (deploy) + +```bash +make deploy_site_demo_gymburgdorf # single demo site +``` + +At its core, `make` only calls `ansible-playbook` — the equivalent +direct invocation: + +```bash +ansible-playbook playbooks/site.yml \ + -i inventories/demo-gymburgdorf/hosts.yml --diff +``` + +All variants (direct invocation, `--limit`, check mode) and +the make targets are documented in [ansible.md § Running Ansible](ansible.md#running-ansible). + +## Next steps + +- **What happens during a deploy?** → [ansible.md § Playbook](ansible.md#playbook-siteyml) +- **Create a new tenant** → [inventories.md § Walkthrough](inventories.md#walkthrough-creating-a-new-demo-tenant) +- **Store secrets in Bao** → [secrets.md](secrets.md) diff --git a/docs/inventories.md b/docs/inventories.md new file mode 100644 index 0000000..ff3a903 --- /dev/null +++ b/docs/inventories.md @@ -0,0 +1,179 @@ + +# Repo layout & inventories + +[← Documentation index](README.md) + +## Repo layout and role origin + +```text +reference-ansible/ +├── Makefile # deploy targets, OIDC login, OBJC fork workaround +├── ansible.cfg # collections_path, remote_user=root, hashi_vault auth_method=token +├── requirements.yml # community.hashi_vault + digitalboard.core (Git) +├── playbooks/site.yml # play sequence (see ansible.md) +├── scripts/bao-seed.sh # seed/merge OpenBao secrets per inventory +├── docs/ # this documentation +├── collections/ # ← installed by `make install`, gitignored +│ └── ansible_collections/ +│ └── digitalboard/core/ +│ └── roles/ # 🔑 THE ROLES LIVE HERE, NOT in the repo root +└── inventories/ + ├── demo-gymburgdorf/ # reference inventory of this documentation + ├── demo-mbazürich/ + ├── demo-phbern/ + └── vagrant/ # local test inventory with its own topology +``` + +> **Important:** There is **no** `roles/` directory in the repo root.
All +> roles come from the `digitalboard.core` collection (see +> [requirements.yml](../requirements.yml)), installed via +> `make install` into `./collections/`. Plays reference them by +> FQCN `digitalboard.core.`. + +## Available inventories + +| Inventory | Purpose | +| --- | --- | +| [`demo-gymburgdorf/`](../inventories/demo-gymburgdorf/) | Demo tenant — **recommended as the template for new tenants** | +| [`demo-mbazürich/`](../inventories/demo-mbazürich/) | Demo tenant | +| [`demo-phbern/`](../inventories/demo-phbern/) | Demo tenant | +| [`vagrant/`](../inventories/vagrant/) | local test VMs; **incompatible group topology** with the demo inventories | + +## Inventory topology (`demo-gymburgdorf`) + +```mermaid +flowchart LR + classDef dmz fill:#fee2e2,stroke:#991b1b,color:#000 + classDef app fill:#dcfce7,stroke:#166534,color:#000 + classDef stor fill:#dbeafe,stroke:#1e40af,color:#000 + classDef turn fill:#fef9c3,stroke:#854d0e,color:#000 + + subgraph ALL["group: all_servers"] + direction LR + subgraph DMZ["DMZ 172.16.9.0/24"] + RP["reverseproxy
172.16.9.111
traefik_mode: dmz"]:::dmz + TURN["turn
172.16.9.112
(no role in site.yml yet)"]:::turn + end + subgraph BE["Backend 172.16.19.0/24
group: backend_servers"] + APP["application
172.16.19.101
traefik_mode: backend
+ authentik, authentik_outpost_ldap,
nextcloud, collabora, drawio, …"]:::app + ST["storage
172.16.19.102
traefik_mode: backend
+ garage (S3)"]:::stor + end + end + + RP -.HTTPS in, HTTP out.-> APP + RP -.HTTPS in, HTTP out.-> ST +``` + +**Group memberships (from [hosts.yml](../inventories/demo-gymburgdorf/hosts.yml)):** + +| Group | Members | Purpose | +| --- | --- | --- | +| `all_servers` | `reverseproxy`, `application`, `storage`, `turn` | base role for all hosts | +| `traefik_servers` | `children: all_servers` (all 4 hosts) | Traefik everywhere; DMZ/backend via `traefik_mode` | +| `backend_servers` | `application`, `storage` | sets `traefik_mode: backend` via group_var | +| `garage_servers` | `storage` | single-host wrapper for the Garage role | +| `nextcloud_servers`, `collabora_servers`, `drawio_servers`, `authentik_servers`, `authentik_outpost_ldap_servers`, `send_servers`, `opnform_servers`, `homarr_servers`, `bookstack_servers` | only `application` each | single-host wrappers | + +> **Difference from the `vagrant` inventory:** `vagrant` structures +> Traefik differently — via the children groups `traefik_servers_dmz` and +> `traefik_servers_backend` instead of via `backend_servers` + +> `host_vars` override. The two topologies are **structurally +> incompatible**; a 1:1 mapping is not possible. For new tenants, therefore, +> take `demo-gymburgdorf` as the template. + +## Standard folder structure of an inventory entry + +A fully built-out inventory follows this layout (example +`demo-gymburgdorf`). Currently only this inventory is built out; +`demo-mbazürich` and `demo-phbern` so far contain only `hosts.yml`. + +```text +inventories/demo-/ +├── hosts.yml # REQUIRED — hosts, IPs, group topology +├── group_vars/ +│ ├── all/ +│ │ ├── vault.yml # REQUIRED — vault_addr, vault_mount (Bao) +│ │ ├── ansible.yml # ansible_python_interpreter etc. 
+│ │ └── docker.yml # docker_registry_mirrors +│ ├── traefik_servers/ +│ │ └── traefik.yml # ACME/TSIG, TLS — applies to ALL Traefik instances +│ └── backend_servers/ +│ └── traefik.yml # traefik_mode: backend (default for app + storage) +└── host_vars/ + ├── reverseproxy/ + │ └── traefik.yml # traefik_mode: dmz + DMZ-specific ACME overrides + ├── application/ + │ ├── main.yml # comment only: which services run here + │ ├── traefik.yml # traefik_dmz_exposed_services (what the DMZ routes) + │ └── .yml # one file per service (nextcloud, authentik, …) + └── storage/ + ├── main.yml # same as above + ├── traefik.yml # traefik_extra_hosts + traefik_dmz_exposed_services + └── garage.yml # service vars for garage +``` + +**Conventions:** + +- **`hosts.yml` is the only hard required file.** Vars are + optional — if one is missing, the role defaults from + `digitalboard.core` take effect. A new inventory therefore starts minimally with + only `hosts.yml` (just like `demo-mbazürich`/`demo-phbern`). +- **`group_vars/all/vault.yml`** is effectively required as soon as + Bao lookups are supposed to work — without `vault_mount`/`vault_addr` the + secret lookups fail. +- **One file per service** under `host_vars//.yml`. The + file name is free (Ansible loads all YAMLs in the directory); by + convention it is named like the role. Which variables belong where: + [ansible.md § Where parameters belong](ansible.md#where-parameters-belong). +- **`main.yml` per host** is pure documentation — a comment indicating which + services run on the host. Carries no productive vars. +- **`host_vars//traefik.yml`** declares via + `traefik_dmz_exposed_services` which local services the + DMZ Traefik should make reachable from outside. The DMZ reads this + list via `hostvars[]` and renders its routers from it. A new + service exposed externally = a new entry here. Mechanics: + [ansible.md § traefik](ansible.md#traefik). 
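
The two file-level conventions above (a mandatory `hosts.yml`, and `group_vars/all/vault.yml` as soon as Bao lookups are used) can be checked before a first run. The following is a minimal sketch and not part of the repo; the function name `check_inventory` and its messages are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped in this repo): check the conventional
# files of one inventory directory.
check_inventory() {
  local inv="$1"

  # hosts.yml is the only hard requirement.
  if [ ! -f "$inv/hosts.yml" ]; then
    echo "ERROR: $inv/hosts.yml is missing (required)" >&2
    return 1
  fi

  # vault.yml is only needed once Bao lookups are in play, so warn, do not fail.
  if [ ! -f "$inv/group_vars/all/vault.yml" ]; then
    echo "WARN: $inv/group_vars/all/vault.yml is missing; Bao lookups will fail" >&2
  fi
  return 0
}
```

From the repo root it could be called as `check_inventory inventories/demo-gymburgdorf`. It only tests for file presence and does not validate the YAML content.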
+ +## Walkthrough: Creating a new demo tenant + +Recommended template: **`demo-gymburgdorf`** (not `vagrant`, because its +group topology is incompatible). + +1. **Copy the inventory:** + + ```bash + cp -r inventories/demo-gymburgdorf inventories/demo- + ``` + +2. **Adjust `hosts.yml`:** IPs, hostnames per host. + +3. **`group_vars/all/vault.yml`** — set `vault_mount` to the new + tenant mount (`demo-`). + +4. **`group_vars/traefik_servers/traefik.yml`** — + point `traefik_acme_dns_zone` and the `acme-tsig` lookup paths to the + new zone / the new Bao path. + +5. Go through **`host_vars/application/*.yml`** and **`host_vars/storage/*.yml`**: + FQDNs to the new domain pattern (e.g. + `*..souveredu.ch`), Bao lookup paths to `demo-/data/…`. + +6. **Prepare OpenBao** (out-of-band, not via Ansible): + - Create a new KV-v2 mount `demo-`. + - Write secrets: `acme-tsig`, `authentik`, `nextcloud`, + `garage`, … — conveniently via `make seed_bao_` (see + [scripts/bao-seed.sh](../scripts/bao-seed.sh) and + [secrets.md § Demo-Only-Defaults](secrets.md#demo-only-defaults--must-be-overridden)). + - Policy for the deploy token: read on `demo-/data/*`. + +7. **DNS:** Create the TSIG update zone (`demo-._acme.digitalboard.ch`) at + `ns1.digitalboard.ch`, CNAMEs + `_acme-challenge.*..` pointing there. + +8. **Makefile** — add a new target modeled on + `deploy_site_demo_gymburgdorf` and add it to `deploy_site_demo`; + likewise a `seed_bao_` target. + +9. **Smoke test:** `ansible all -i inventories/demo-/hosts.yml -m ping`. + +10. **Deploy:** Bao login + `make deploy_site_demo_`. diff --git a/docs/operations.md b/docs/operations.md new file mode 100644 index 0000000..3176312 --- /dev/null +++ b/docs/operations.md @@ -0,0 +1,136 @@ + +# Setup & operations + +[← Documentation index](README.md) + +## Prerequisites (control node) + +- `ansible` (Core ≥ 2.15) +- `bao` CLI ([OpenBao](https://openbao.org/)) — e.g. 
+ `sudo pacman -S openbao python-hvac` (Arch) or via Homebrew +- `python-hvac` (for `community.hashi_vault` lookups) +- On macOS: `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES` (set in the + [Makefile](../Makefile); without this env var, Ansible forks crash + on the first `community.hashi_vault` lookup, because the + Objective-C runtime is not fork-safe) + +## Initial setup + +```bash +git clone +cd reference-ansible +make install # Galaxy + digitalboard.core into ./collections/ +``` + +`make install` installs `community.hashi_vault` and the +`digitalboard.core` collection (Git, see [requirements.yml](../requirements.yml)) +into `./collections/`. There is **no** `roles/` directory in the +repo root — all roles come from the collection, see +[inventories.md § Repo layout](inventories.md#repo-layout-and-role-origin). + +## Secrets (OpenBao) + +Before every deploy, authenticate to OpenBao in the **same shell**. The +full login flow, the `make bao` caveat, the lookup pattern, and +tenant isolation are documented in **[secrets.md](secrets.md)**. + +## Smoke test + +```bash +make ping_demo # pings all three demo inventories (ping module) +``` + +## Deploy + +```bash +make deploy_site_demo_gymburgdorf # single demo site +make deploy_site_demo_mbazürich +make deploy_site_demo_phbern +make deploy_site_demo # all three in sequence +``` + +`--diff` is set in the Gymburgdorf target → changes are visible per task. +The play order and which plays run as no-ops: +see [ansible.md § Playbooks](ansible.md#playbook-siteyml). + +## Makefile reference + +The [Makefile](../Makefile) bundles setup, secret handling, and deploy. +It defines no variables for passing in except `DRY_RUN` (for the +`seed_bao_*` targets) — control is via the chosen target. 
+ +### Exported env vars (apply to all targets) + +| Variable | Value | Purpose | +| --- | --- | --- | +| `BAO_ADDR` | `https://bao.digitalboard.ch` | OpenBao endpoint for `bao` and `community.hashi_vault` calls | +| `OBJC_DISABLE_INITIALIZE_FORK_SAFETY` | `YES` | macOS fork safety: without this var, Ansible forks crash on the first `hashi_vault` lookup, because the Objective-C runtime is not fork-safe | + +> Both are set via `export` at the top of the Makefile and thus +> inherited by every target shell process — regardless of which target runs. + +### Setup & secrets + +| Target | Effect | +| --- | --- | +| `make install` | `ansible-galaxy collection install -r requirements.yml -p collections` — installs `community.hashi_vault` + `digitalboard.core` into `./collections/` | +| `make bao` | `bao login -method=oidc -path=Digitalboard role=default` + sets `VAULT_TOKEN` via `$(eval …)`. ⚠️ The token only lives **within this single `make` invocation** — see caveat below | +| `make seed_bao_gymburgdorf` | Seed/merge OpenBao secrets for `demo-gymburgdorf` via [scripts/bao-seed.sh](../scripts/bao-seed.sh). Idempotent: existing keys remain, only missing ones are generated | +| `make seed_bao_mbazürich` | same for `demo-mbazürich` | +| `make seed_bao_phbern` | same for `demo-phbern` | + +> The `seed_bao_*` targets understand `DRY_RUN=1` — shows the diff without +> writing: `make seed_bao_gymburgdorf DRY_RUN=1`. Requirement: +> `bao`, `jq`, `openssl` in `$PATH` and a valid `VAULT_TOKEN`. 
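
The requirements named above (`bao`, `jq`, `openssl` in `$PATH`, plus a `VAULT_TOKEN`) can be verified up front. This is a sketch under stated assumptions: `seed_preflight` is a hypothetical name, the helper is not part of the Makefile or `scripts/bao-seed.sh`, and it only checks that `VAULT_TOKEN` is non-empty, not that the token is still valid.

```bash
#!/usr/bin/env bash
# Hypothetical preflight for the seed_bao_* targets (not part of the repo).
seed_preflight() {
  local missing=0 tool

  # Tools the seed targets are documented to need.
  for tool in bao jq openssl; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done

  # Presence check only; an expired token would still pass here.
  if [ -z "${VAULT_TOKEN:-}" ]; then
    echo "VAULT_TOKEN is not set" >&2
    missing=1
  fi

  return "$missing"
}
```

A possible use is `seed_preflight && make seed_bao_gymburgdorf DRY_RUN=1`, so that the dry run starts only when the prerequisites are in place.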
+ +### Smoke test & deploy + +| Target | Effect | +| --- | --- | +| `make ping_demo` | `ansible … -m ping` against all three demo inventories in sequence; failures of individual hosts do not abort (`\|\| true`) | +| `make deploy_site_demo_gymburgdorf` | `ansible-playbook playbooks/site.yml -i …/demo-gymburgdorf/hosts.yml --diff` | +| `make deploy_site_demo_mbazürich` | same for `demo-mbazürich` — **without** `--diff` | +| `make deploy_site_demo_phbern` | same for `demo-phbern` — **without** `--diff` | +| `make deploy_site_demo` | calls the three `deploy_site_demo_*` targets in sequence | + +> **Inconsistency:** Only the Gymburgdorf target sets `--diff`. For +> `mbazürich` and `phbern` you do not see the task changes — if +> needed, invoke directly with `ansible-playbook … --diff`, see +> [ansible.md § Running Ansible](ansible.md#running-ansible). + +### Token caveat (`make bao`) + +`make bao` alone is **not** enough for a deploy: each `make` target +runs in its own shell, the `VAULT_TOKEN` set there only lives +during `make bao` itself and is already gone in the next `make deploy_…`. +Two working approaches: + +```bash +# Variant A — log in manually in the active shell (survives multiple make invocations) +export BAO_ADDR=https://bao.digitalboard.ch +bao login -method=oidc -path=Digitalboard +export VAULT_TOKEN=$(bao print token) +make deploy_site_demo_gymburgdorf + +# Variant B — chain both as ONE make invocation (token lives for the chain) +make bao deploy_site_demo_gymburgdorf +``` + +Login details and the secret pattern: [secrets.md](secrets.md#openbao-login). + +## Known gaps and trade-offs + +- **`opencloud` in `demo-gymburgdorf`:** Play present, but no + `opencloud_servers` group — runs as a no-op. If needed, add a group + + `host_vars`, see [inventories.md](inventories.md#walkthrough-creating-a-new-demo-tenant). 
+- **`turn` host:** defined in the DMZ, but no STUN/TURN role in + [playbooks/site.yml](../playbooks/site.yml) — provisioned only via `base` + + `traefik`. +- **Idempotency:** Roles are Docker-Compose-based; re-runs can + trigger container restarts when Compose inputs change. No + rollback mechanism — on failure, roll back manually. +- **TLS renewal:** handled internally by Traefik via ACME, no external + renew cron in the repo. +- **CI/testing:** currently not in the repo; smoke test via `make ping_demo`. +- **Logging:** `traefik_log_level: DEBUG` in `demo-gymburgdorf` and + `vagrant` (role default `INFO`) — reduce before adapting to prod. diff --git a/docs/secrets.md b/docs/secrets.md new file mode 100644 index 0000000..649cc59 --- /dev/null +++ b/docs/secrets.md @@ -0,0 +1,99 @@ + +# Secrets & security + +[← Documentation index](README.md) + +> This repo is explicitly intended for **demo setups**. All +> default values in the roles are insecure and are overridden in +> `demo-*` inventories via Bao lookups or host_vars. + +## OpenBao login + +A prerequisite is a WKS account with OIDC access to OpenBao and a +read policy on the inventory mount — see +[getting_started.md § Prerequisites](getting_started.md#prerequisites). + +Before each deploy, authenticate in **the same shell** in which +`ansible-playbook` then runs: + +```bash +export BAO_ADDR=https://bao.digitalboard.ch +bao login -method=oidc -path=Digitalboard +export VAULT_TOKEN=$(bao print token) +``` + +> ⚠️ `make bao` alone is **not** enough — every `make` target runs in +> a new shell, and the `VAULT_TOKEN` set there lives only during +> `make bao` itself. Either run the three commands above manually +> or chain `make bao deploy_site_demo_gymburgdorf` as **one** call +> — otherwise the deploy has no token.
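
The reason behind the note above is general shell behaviour: an environment variable exported in a child process never travels back to the parent. The snippet below illustrates this without touching OpenBao; `DEMO_TOKEN` is a made-up variable name used only for the demonstration.

```bash
#!/usr/bin/env bash
# DEMO_TOKEN is a hypothetical variable, chosen only for this illustration.
unset DEMO_TOKEN

# The child shell exports the variable and can read it...
sh -c 'export DEMO_TOKEN=from-child; echo "child:  ${DEMO_TOKEN}"'

# ...but the parent shell never receives it.
echo "parent: ${DEMO_TOKEN:-<unset>}"
```

The same holds for a token set inside a separate `make` run, which is why the login has to happen in the shell that later starts the deploy.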
+ +## Secret pattern (Bao lookup) + +Secrets are never stored in plaintext, but read from +OpenBao at runtime: + +```yaml +# host_vars/.../.yml — one lookup per service path, +# individual keys as properties: +_nextcloud: "{{ lookup('community.hashi_vault.hashi_vault', + vault_mount + '/data/nextcloud', url=vault_addr) }}" +nextcloud_admin_password: "{{ _nextcloud.admin_password }}" +nextcloud_postgres_password: "{{ _nextcloud.postgres_password }}" +``` + +- `vault_mount` and `vault_addr` come from + [group_vars/all/vault.yml](../inventories/demo-gymburgdorf/group_vars/all/vault.yml). +- KV-v2 paths need an explicit `/data/` in the path — Ansible does not + resolve this on its own. +- `vault_mount` is unique per inventory (`demo-gymburgdorf`, + `demo-phbern`, …) → tenant isolation in Bao via mount + policy. + +Secrets are seeded idempotently with [scripts/bao-seed.sh](../scripts/bao-seed.sh) (or +`make seed_bao_`): existing keys remain, +only missing ones are generated. OIDC client secrets are kept in sync between +`/data/authentik` and the respective service secret. + +## Demo-only defaults — must be overridden + +These defaults in `digitalboard.core` are insecure. 
In every +**production-grade** deployment they must be overridden via a Bao lookup or host_var: + +| Variable | Default | Where to override | +| --- | --- | --- | +| `keycloak_admin_password` | `changeme` | host_vars `keycloak_servers` | +| `keycloak_postgres_password` | `changeme` | same as above | +| `authentik_secret_key` | `changeme-generate-a-random-string` | `host_vars/application/authentik.yml` | +| `authentik_postgres_password` | `changeme` | same as above | +| `nextcloud_admin_password` | `admin` | `host_vars/application/nextcloud.yml` | +| `nextcloud_postgres_password` | `changeme` | same as above | +| `nextcloud_s3_key` / `nextcloud_s3_secret` | `changeme` / `changeme` | same as above | +| `garage_webui_password` | `admin` | `host_vars/storage/garage.yml` | +| `garage_rpc_secret` | `0123…cdef` (64-hex constant) | same as above | +| `garage_admin_token` | identical to `rpc_secret` | same as above | +| `garage_metrics_token` | identical to `rpc_secret` | same as above | + +> **Convention:** Every value above **must** have a Bao lookup in +> `demo-*/host_vars/.../...yml` before the +> inventory counts as deployable. + +## Threat boundaries (status: demo) + +| Boundary | Status | Note | +| --- | --- | --- | +| DMZ ↔ backend (172.16.9 ↔ 172.16.19) | **plaintext HTTP** | auth bearer, OIDC code, session cookies travel unencrypted. Demo-ok; prod: mTLS or WireGuard overlay. | +| Host firewall | **missing** | The `base` role installs no UFW/nftables. Segmentation depends on the hypervisor/VLAN. | +| SSH | `ansible_user: root` | No bastion, no jump host. Key distribution out-of-band. | +| Authentik SPOF | **accepted** | IdP and SP services on the same host (`application`). Authentik outage = login outage including LDAP outpost. No break-glass path. | +| ACME TSIG key | Bao lookup | One TSIG key per demo zone (`acme_update_key_demo_gymb`), zone-isolated. Rotation manual. 
| +| Backup/DR | **out-of-scope** | Garage `replication_factor: 1`, no Postgres backup job, no Bao snapshot cron. | + +## Add for production adaptation + +- Host FW (extend the `base` role or a dedicated `firewall` role). +- mTLS or WireGuard between DMZ and backend. +- Authentik on a separate host, with a recovery admin token. +- Bao policies per inventory mount (read-only for the deploy token, + write-only for the bootstrap job). +- Backup cron for Postgres + Garage + Bao. +- SSH bastion + key rotation. diff --git a/docs/testing.md b/docs/testing.md new file mode 100644 index 0000000..9ded098 --- /dev/null +++ b/docs/testing.md @@ -0,0 +1,93 @@ + +# Testing + +[← Documentation index](README.md) + +> **Status:** This repo contains **no** automated test suite +> and **no** CI pipeline. It is the inventory/configuration +> layer — the testable logic (roles) lives in +> [`digitalboard.core`](https://git.digitalboard.ch/Digitalboard/digitalboard.core). +> Role tests (Molecule or similar) therefore belong in the core repo, not +> here. + +What this repo can sensibly catch are **inventory and playbook errors +before the actual deploy**. There are three levels for that, from fast and +risk-free to a full run against the hosts. + +## 1. Static checks (no host access needed) + +No `VAULT_TOKEN`, no network access required: + +```bash +# YAML syntax of all inventory files +yamllint inventories/ + +# Ansible best-practice lint over the playbook +ansible-lint playbooks/site.yml + +# Playbook syntax + inventory parsing (does not yet resolve Bao lookups) +ansible-playbook playbooks/site.yml \ + -i inventories/demo-gymburgdorf/hosts.yml --syntax-check +``` + +`yamllint` and `ansible-lint` are not preconfigured in the repo — +they run with their defaults. If project-wide rules are desired, +a `.yamllint`/`.ansible-lint` in the repo root would be the place for them +(see [Open items](#open-items)). + +## 2. 
Inspect inventory resolution + +Shows the effectively merged variables per host — useful for seeing +precedence surprises (group_vars vs. host_vars) before the deploy. +Bao lookups are evaluated here, so `VAULT_TOKEN` is needed (see +[secrets.md](secrets.md#openbao-login)): + +```bash +# group/host structure as a tree +ansible-inventory -i inventories/demo-gymburgdorf/hosts.yml --graph + +# all vars of a host (merged) +ansible-inventory -i inventories/demo-gymburgdorf/hosts.yml --host application +``` + +## 3. Smoke test & dry run (against the hosts) + +Requires SSH access and `VAULT_TOKEN` — for prerequisites see +[getting_started.md](getting_started.md#prerequisites): + +```bash +# reachability of all demo hosts (ping module) +make ping_demo + +# dry run: shows what WOULD change, without writing +ansible-playbook playbooks/site.yml \ + -i inventories/demo-gymburgdorf/hosts.yml --check --diff +``` + +> `--check` is only of limited value with the Docker-Compose-based roles: +> some tasks report "changed" because they only know the real +> container state at runtime. As a plausibility check +> (does the playbook run through, are the vars correct?) it is still +> useful. More on the invocation variants: +> [ansible.md § Running Ansible](ansible.md#running-ansible). + +## Recommended pre-deploy workflow + +```bash +yamllint inventories/ && ansible-lint playbooks/site.yml # 1. static +ansible-playbook playbooks/site.yml -i /hosts.yml --syntax-check +make ping_demo # 2. reachability +ansible-playbook playbooks/site.yml -i /hosts.yml --check --diff # 3. dry run +# only then: real deploy +``` + +## Open items + +- **No CI** — lint/syntax checks run only manually. A Gitea/ + CI workflow that runs the static checks from level 1 on every push + would be the next step. +- **No lint configuration** — `yamllint`/`ansible-lint` run with + defaults; project-wide rules (`.yamllint`, `.ansible-lint`) are missing. 
+- **Role tests** — belong in + [`digitalboard.core`](https://git.digitalboard.ch/Digitalboard/digitalboard.core), + not in this repo. diff --git a/inventories/demo-gymburgdorf/host_vars/application/authentik.yml b/inventories/demo-gymburgdorf/host_vars/application/authentik.yml index fd94e37..55d6a3a 100644 --- a/inventories/demo-gymburgdorf/host_vars/application/authentik.yml +++ b/inventories/demo-gymburgdorf/host_vars/application/authentik.yml @@ -2,15 +2,21 @@ # Bao secret expected at /data/authentik with keys: # secret_key, postgres_password, admin_password, # ldap_outpost_token, -# nextcloud_oidc_secret +# nextcloud_oidc_secret, +# opnform_oidc_secret, homarr_oidc_secret, bookstack_oidc_secret _authentik: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/authentik', url=vault_addr) }}" -# First entry is the canonical public FQDN. Additional entries cover -# internal *.int.* names so server-to-server traffic (e.g. the LDAP -# outpost) hits authentik on a name with a valid internal cert and -# skips the DMZ hop. +# Canonical public FQDN browsers and OIDC iss-claim use. authentik_domains: - "auth.gymb.souveredu.ch" + +# Internal FQDN for server-to-server calls (Nextcloud OIDC discovery, +# token, userinfo; LDAP outpost configuration pull). Traefik rewrites +# the Host header to `authentik_domains[0]` on these routers so authentik +# still emits issuer URLs against the public hostname — that keeps the +# iss claim matching what the browser sees while the traffic itself +# stays inside the LAN (the DMZ has no hairpin-NAT for the public IP). +authentik_host_rewrite_domains: - "auth.int.gymb.souveredu.ch" authentik_secret_key: "{{ _authentik.secret_key }}" authentik_postgres_password: "{{ _authentik.postgres_password }}" @@ -31,6 +37,44 @@ authentik_ldap_outpost: authentik_host: "https://auth.int.gymb.souveredu.ch/" log_level: "info" +# Proxy providers (ForwardAuth) — gate downstream services behind +# authentik. 
The embedded outpost (which authentik ships out of the box) +# hosts these providers under /outpost.goauthentik.io/auth/traefik on the +# canonical FQDN; the service-side traefik attaches a ForwardAuth +# middleware that talks to that endpoint. +authentik_proxy_apps: + - slug: drawio + name: Drawio + external_host: "https://draw.gymb.souveredu.ch" + internal_host: "http://drawio:8080" + allowed_groups: + - admins + flows: + authentication_slug: default-authentication-flow + authorization_slug: default-provider-authorization-implicit-consent + invalidation_slug: default-provider-invalidation-flow + - slug: garage-webui + name: "Garage S3 Console" + external_host: "https://console.s3.gymb.souveredu.ch" + internal_host: "http://garage-webui:3909" + allowed_groups: + - admins + flows: + authentication_slug: default-authentication-flow + authorization_slug: default-provider-authorization-implicit-consent + invalidation_slug: default-provider-invalidation-flow + +# Bind both proxy providers to authentik's built-in embedded outpost so +# we don't have to deploy a separate proxy outpost container. The +# embedded outpost listens on the same host:9000 as the authentik server +# and exposes /outpost.goauthentik.io/auth/traefik for ForwardAuth. 
+authentik_proxy_outposts: + - name: "authentik Embedded Outpost" + type: proxy + providers: + - drawio + - garage-webui + # OIDC clients authentik_oidc_apps: - slug: nextcloud @@ -45,10 +89,52 @@ authentik_oidc_apps: authorization_slug: default-provider-authorization-implicit-consent invalidation_slug: default-provider-invalidation-flow scopes: [openid, email, profile, offline_access] + - slug: opnform + name: OpnForm + client_id: opnform + client_secret: "{{ _authentik.opnform_oidc_secret }}" + redirect_uris: + - url: "https://forms.gymb.souveredu.ch/auth/authentik/callback" + matching_mode: strict + signing_key_name: "authentik Self-signed Certificate" + flows: + authorization_slug: default-provider-authorization-implicit-consent + invalidation_slug: default-provider-invalidation-flow + # No separate `groups` scope — authentik's default `profile` mapping + # already emits a `groups` claim built from request.user.groups, so + # OpnForm's admin-group mapping works without an extra scope. 
+ scopes: [openid, email, profile] + - slug: homarr + name: Homarr + client_id: homarr + client_secret: "{{ _authentik.homarr_oidc_secret }}" + redirect_uris: + - url: "https://home.gymb.souveredu.ch/api/auth/callback/oidc" + matching_mode: strict + signing_key_name: "authentik Self-signed Certificate" + flows: + authorization_slug: default-provider-authorization-implicit-consent + invalidation_slug: default-provider-invalidation-flow + scopes: [openid, email, profile] + - slug: bookstack + name: BookStack + client_id: bookstack + client_secret: "{{ _authentik.bookstack_oidc_secret }}" + redirect_uris: + - url: "https://wiki.gymb.souveredu.ch/oidc/callback" + matching_mode: strict + signing_key_name: "authentik Self-signed Certificate" + flows: + authorization_slug: default-provider-authorization-implicit-consent + invalidation_slug: default-provider-invalidation-flow + scopes: [openid, email, profile] authentik_groups: - name: admins - name: users + - name: opnform-admins + - name: homarr-admins + - name: bookstack-admins authentik_local_users: - username: akadmin diff --git a/inventories/demo-gymburgdorf/host_vars/application/bookstack.yml b/inventories/demo-gymburgdorf/host_vars/application/bookstack.yml new file mode 100644 index 0000000..1d0beac --- /dev/null +++ b/inventories/demo-gymburgdorf/host_vars/application/bookstack.yml @@ -0,0 +1,43 @@ +--- +# Bao secret /data/bookstack expected to contain: +# db_root_password, db_password, admin_password, oidc_client_secret, +# app_key (optional — only set when restoring) +_bookstack: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/bookstack', url=vault_addr) }}" + +bookstack_domain: "wiki.gymb.souveredu.ch" +bookstack_extra_domains: + - "wiki.int.gymb.souveredu.ch" +bookstack_base_url: "https://wiki.gymb.souveredu.ch" + +# Override the role-default certresolver ("le") with the value used +# across this demo (matches traefik_ssl_cert_resolver in group_vars). 
+bookstack_traefik_certresolver: "dns" + +bookstack_db_root_password: "{{ _bookstack.db_root_password }}" +bookstack_db_password: "{{ _bookstack.db_password }}" +bookstack_admin_password: "{{ _bookstack.admin_password }}" +bookstack_admin_email: "admin@gymb.souveredu.ch" +bookstack_admin_name: "BookStack Admin" + +# OIDC against Authentik. BookStack compares OIDC_ISSUER strictly against +# the `iss` claim in the discovery response. Authentik emits the public +# auth.gymb.* hostname there (host-rewrite middleware ensures the claim +# matches what browsers see during login), so the issuer URL must use the +# public FQDN. Pinning auth.gymb.* in /etc/hosts below keeps the actual +# server-to-server traffic on the LAN. +bookstack_oidc_enabled: true +bookstack_oidc_name: "Authentik" +bookstack_oidc_issuer: "https://auth.gymb.souveredu.ch/application/o/bookstack/" +bookstack_oidc_client_id: "bookstack" +bookstack_oidc_client_secret: "{{ _bookstack.oidc_client_secret }}" +bookstack_oidc_additional_scopes: "openid profile email" +bookstack_oidc_user_to_groups: true +bookstack_oidc_groups_claim: "groups" +bookstack_oidc_auto_initiate: false + +# Pin auth.gymb.* to the application host so server-to-server OIDC calls +# (discovery, token, userinfo, jwks) stay in the LAN and reach authentik +# directly without hairpinning through the DMZ (which has no NAT loop +# back to its own public IP). 
+bookstack_extra_hosts: + - "auth.gymb.souveredu.ch:172.16.19.101" diff --git a/inventories/demo-gymburgdorf/host_vars/application/drawio.yml b/inventories/demo-gymburgdorf/host_vars/application/drawio.yml index beef6a5..2d7f6b6 100644 --- a/inventories/demo-gymburgdorf/host_vars/application/drawio.yml +++ b/inventories/demo-gymburgdorf/host_vars/application/drawio.yml @@ -1,2 +1,19 @@ --- drawio_domain: "draw.gymb.souveredu.ch" + +# Internal FQDN the DMZ reverseproxy uses as backend host so its TLS +# verify matches a cert SAN (the canonical IP-only route has no SAN +# and breaks with "cannot validate certificate ... no IP SANs"). Same +# split-horizon pattern as cloud.int.* / auth.int.* / office.int.*. +drawio_extra_domains: + - "draw.int.gymb.souveredu.ch" + +# Gate drawio behind the authentik embedded outpost (admins-only — +# enforced by the policy-binding on the authentik proxy application). +# ForwardAuth talks to the embedded outpost on the authentik server's +# in-network address. Going via the public FQDN routes through a second +# traefik hop that strips/rewrites X-Forwarded-Host, which breaks +# authentik's provider matching (it returns 404). Plain HTTP to the +# container is the path docs recommend for the embedded outpost. 
+drawio_authentik_forward_auth: true +drawio_authentik_forward_auth_url: "http://authentik-server-1:9000/outpost.goauthentik.io/auth/traefik" diff --git a/inventories/demo-gymburgdorf/host_vars/application/homarr.yml b/inventories/demo-gymburgdorf/host_vars/application/homarr.yml new file mode 100644 index 0000000..dc3ba2e --- /dev/null +++ b/inventories/demo-gymburgdorf/host_vars/application/homarr.yml @@ -0,0 +1,71 @@ +--- +# Bao secret /data/homarr expected to contain: +# secret_encryption_key (64 hex chars), admin_password, oidc_client_secret +_homarr: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/homarr', url=vault_addr) }}" + +homarr_domain: "home.gymb.souveredu.ch" +homarr_extra_domains: + - "home.int.gymb.souveredu.ch" +homarr_base_url: "https://home.gymb.souveredu.ch" + +homarr_secret_encryption_key: "{{ _homarr.secret_encryption_key }}" +homarr_admin_username: "admin" +homarr_admin_email: "admin@gymb.souveredu.ch" +homarr_admin_password: "{{ _homarr.admin_password }}" + +# OIDC against Authentik. credentials provider stays enabled as a +# break-glass account. +homarr_auth_providers: "credentials,oidc" +homarr_oidc_issuer: "https://auth.int.gymb.souveredu.ch/application/o/homarr/" +homarr_oidc_client_id: "homarr" +homarr_oidc_client_secret: "{{ _homarr.oidc_client_secret }}" +homarr_oidc_client_name: "Authentik" +homarr_oidc_scopes: "openid profile email groups" +homarr_oidc_groups_attribute: "groups" +homarr_oidc_admin_group: "homarr-admins" +homarr_oidc_auto_login: "false" + +# Default board with shortcuts to the other gymburgdorf services. Width +# values describe horizontal grid cells (1-10 desktop / 6 tablet / 2 +# mobile, packed left-to-right). 
+homarr_apps: + - id: nextcloud + name: Nextcloud + description: "Cloud Storage & Collaboration" + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/nextcloud.png + href: https://cloud.gymb.souveredu.ch + width: 2 + - id: collabora + name: Collabora Office + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/collaboraonline.png + href: https://office.gymb.souveredu.ch + width: 2 + - id: drawio + name: Draw.io + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/drawio.png + href: https://draw.gymb.souveredu.ch + width: 2 + - id: send + name: Send + description: "Encrypted file-share" + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/firefox-send.png + href: https://send.gymb.souveredu.ch + width: 2 + - id: opnform + name: OpnForm + description: "Self-hosted forms" + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/opnform.png + href: https://forms.gymb.souveredu.ch + width: 2 + - id: bookstack + name: BookStack + description: "Wiki & documentation" + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/bookstack.png + href: https://wiki.gymb.souveredu.ch + width: 2 + - id: authentik + name: Authentik + description: "Identity provider" + icon: https://cdn.jsdelivr.net/gh/walkxcode/dashboard-icons/png/authentik.png + href: https://auth.gymb.souveredu.ch + width: 2 diff --git a/inventories/demo-gymburgdorf/host_vars/application/nextcloud.yml b/inventories/demo-gymburgdorf/host_vars/application/nextcloud.yml index 06bb35a..b30a0f9 100644 --- a/inventories/demo-gymburgdorf/host_vars/application/nextcloud.yml +++ b/inventories/demo-gymburgdorf/host_vars/application/nextcloud.yml @@ -4,6 +4,11 @@ _nextcloud: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/nextcloud', url=vault_addr) }}" _authentik: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/authentik', url=vault_addr) }}" +# 33.0.2 hits the PHP 8.4 TypeError in UserConfig::getValueBool() that +# 
user_ldap triggers on every authenticated request (nextcloud/server +# #59629; fix in 33.0.3). Pin to the patched tag. +nextcloud_image: "nextcloud:33.0.3-fpm" + # First entry is the canonical public FQDN (used for OVERWRITEHOST and # OIDC redirects). Additional entries cover internal *.int.* names so # collabora's WOPI callbacks hit nextcloud on a name with a valid @@ -57,10 +62,25 @@ nextcloud_s3_port: 443 nextcloud_s3_ssl: true nextcloud_s3_usepath_style: true +# OIDC server-to-server discovery / token / userinfo goes to +# auth.int.gymb.souveredu.ch (LAN, RFC1918). Nextcloud's DnsPinMiddleware +# would otherwise block that as "local server access". +nextcloud_allow_local_remote_servers: true + # Share the LDAP docker network with the authentik LDAP outpost nextcloud_extra_networks: - ldap +# Pin the public authentik FQDN to the application host so server-to-server +# OIDC traffic (token, userinfo, jwks — endpoints the discovery doc lists +# under auth.gymb.* even when discovery itself is fetched via auth.int.*) +# stays in the LAN. Without this, curl in the PHP container would hit the +# public IP and time out in the DMZ (no hairpin-NAT). The DnsPin middleware +# only honours /etc/hosts when allow_local_remote_servers is enabled, so +# that flag (set above) is what makes this entry effective. +nextcloud_extra_hosts: + - "auth.gymb.souveredu.ch:172.16.19.101" + # LDAP backend (Authentik LDAP outpost) nextcloud_ldap_enabled: true nextcloud_ldap_config: @@ -98,12 +118,13 @@ nextcloud_oidc_providers: display_name: "Login with Authentik" client_id: nextcloud client_secret: "{{ _authentik.nextcloud_oidc_secret }}" - # Stays on the public FQDN: user_oidc validates the iss claim against - # the discovery host, and authentik returns iss based on the request - # host — using auth.int.* would break the iss match with what the - # browser sees (auth.gymb.*). Routed via the DMZ for now; revisit if - # this becomes a bottleneck. 
- discovery_url: "https://auth.gymb.souveredu.ch/application/o/nextcloud/.well-known/openid-configuration" + # Discovery via the internal FQDN (LAN-only) — the DMZ has no + # hairpin-NAT for the public IP, so server-to-server calls to + # auth.gymb.* would time out. The traefik router for auth.int.* + # rewrites the Host header to auth.gymb.souveredu.ch before the + # request reaches authentik, so the iss claim authentik emits still + # matches the public hostname the browser sees during login. + discovery_url: "https://auth.int.gymb.souveredu.ch/application/o/nextcloud/.well-known/openid-configuration" scope: "openid email profile" unique_uid: true mapping: diff --git a/inventories/demo-gymburgdorf/host_vars/application/opnform.yml b/inventories/demo-gymburgdorf/host_vars/application/opnform.yml new file mode 100644 index 0000000..1526ac0 --- /dev/null +++ b/inventories/demo-gymburgdorf/host_vars/application/opnform.yml @@ -0,0 +1,60 @@ +--- +# Bao secret /data/opnform expected to contain: +# app_key (must start with "base64:"), jwt_secret, front_api_secret, +# db_password, admin_password, oidc_client_secret +_opnform: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/opnform', url=vault_addr) }}" +_authentik: "{{ lookup('community.hashi_vault.hashi_vault', vault_mount + '/data/authentik', url=vault_addr) }}" + +opnform_domain: "forms.gymb.souveredu.ch" +opnform_extra_domains: + - "forms.int.gymb.souveredu.ch" +opnform_base_url: "https://forms.gymb.souveredu.ch" + +opnform_app_key: "{{ _opnform.app_key }}" +opnform_jwt_secret: "{{ _opnform.jwt_secret }}" +opnform_front_api_secret: "{{ _opnform.front_api_secret }}" +opnform_db_password: "{{ _opnform.db_password }}" + +# Bootstrap admin via API on first run so the manual setup page is +# skipped. The admin credentials are also required to seed the OIDC +# IdentityConnection through OpnForm's API (only an authenticated admin +# can create connections). 
+opnform_admin_name: "OpnForm Admin" +opnform_admin_email: "admin@gymb.souveredu.ch" +opnform_admin_password: "{{ _opnform.admin_password }}" + +# OIDC against Authentik. Discovery via the internal FQDN keeps +# server-to-server traffic in the LAN; Authentik's host-rewrite router +# rewrites the Host header to auth.gymb.* before the request reaches +# authentik so the iss claim still matches the public hostname browsers +# see during login. +opnform_oidc_enabled: true +# Issuer must use the public auth.gymb.* FQDN: OpnForm does OIDC +# discovery and then validates the token's `iss` claim against this +# value. Authentik emits the public hostname in `iss` (its host-rewrite +# middleware keeps the claim aligned with what browsers see), so an +# internal-FQDN issuer here would fail iss validation. The extra_hosts +# pin below keeps the actual discovery/token/userinfo traffic on the LAN. +opnform_oidc_issuer: "https://auth.gymb.souveredu.ch/application/o/opnform/" +opnform_oidc_client_id: "opnform" +opnform_oidc_client_secret: "{{ _opnform.oidc_client_secret }}" +opnform_oidc_client_name: "Authentik" +opnform_oidc_slug: "authentik" +opnform_oidc_domain: "gymb.souveredu.ch" +opnform_oidc_admin_group: "opnform-admins" + +# Disable password login entirely — every user goes through Authentik. +# All real users have @gymb.souveredu.ch addresses (matching +# opnform_oidc_domain above), so no password fallback is needed. +opnform_oidc_force_login: true + +# Serve a /sso page that jumps straight to Authentik without the email +# login form. Link users to https://forms.gymb.souveredu.ch/sso. +opnform_oidc_sso_entrypoint: true + +# Pin auth.gymb.* to the application host so server-to-server OIDC +# calls (token, userinfo, jwks — endpoints discovery returns under the +# public hostname even when discovery itself is fetched via auth.int.*) +# stay in the LAN. 
+opnform_extra_hosts: + - "auth.gymb.souveredu.ch:172.16.19.101" diff --git a/inventories/demo-gymburgdorf/host_vars/application/send.yml b/inventories/demo-gymburgdorf/host_vars/application/send.yml new file mode 100644 index 0000000..ee1887a --- /dev/null +++ b/inventories/demo-gymburgdorf/host_vars/application/send.yml @@ -0,0 +1,8 @@ +--- +# Send: anonymized self-hosted file-share (no login). First entry is the +# canonical public FQDN (used as BASE_URL); the *.int.* entry covers the +# server-to-server hop from the DMZ reverseproxy with a cert SAN that +# matches the backend hostname (same split-horizon pattern as cloud/draw). +send_domains: + - "send.gymb.souveredu.ch" + - "send.int.gymb.souveredu.ch" diff --git a/inventories/demo-gymburgdorf/host_vars/application/traefik.yml b/inventories/demo-gymburgdorf/host_vars/application/traefik.yml index d18cd49..a9db420 100644 --- a/inventories/demo-gymburgdorf/host_vars/application/traefik.yml +++ b/inventories/demo-gymburgdorf/host_vars/application/traefik.yml @@ -21,9 +21,26 @@ traefik_dmz_exposed_services: protocol: https - name: drawio domain: draw.gymb.souveredu.ch - # No internal FQDN/cert for drawio yet — proxy by IP. Combined - # with serversTransport `insecureSkipVerify` (handled by the - # selfsigned-mode branch in the template), or accept the route's - # 500 until the cert is wired up. 
+ backend_host: draw.int.gymb.souveredu.ch + port: 443 + protocol: https + - name: send + domain: send.gymb.souveredu.ch + backend_host: send.int.gymb.souveredu.ch + port: 443 + protocol: https + - name: opnform + domain: forms.gymb.souveredu.ch + backend_host: forms.int.gymb.souveredu.ch + port: 443 + protocol: https + - name: homarr + domain: home.gymb.souveredu.ch + backend_host: home.int.gymb.souveredu.ch + port: 443 + protocol: https + - name: bookstack + domain: wiki.gymb.souveredu.ch + backend_host: wiki.int.gymb.souveredu.ch port: 443 protocol: https diff --git a/inventories/demo-gymburgdorf/host_vars/storage/garage.yml b/inventories/demo-gymburgdorf/host_vars/storage/garage.yml index a6e0d79..37f4179 100644 --- a/inventories/demo-gymburgdorf/host_vars/storage/garage.yml +++ b/inventories/demo-gymburgdorf/host_vars/storage/garage.yml @@ -12,8 +12,18 @@ garage_s3_domains: garage_webui_domain: "console.s3.gymb.souveredu.ch" garage_use_ssl: true garage_webui_enabled: true +# Gate the WebUI behind authentik (admins-only, via policy-binding on the +# authentik proxy app). Replaces the htpasswd Basic-Auth — AUTH_USER_PASS +# is dropped from the compose env when this is true. The forwardauth URL +# resolves to the application-host traefik (network alias +# `auth.gymb.souveredu.ch` -> authentik-server-1 in the proxy network on +# the application host), but THIS host (storage) is in a different LAN, +# so traefik here reaches it via the public name through the DMZ proxy. +garage_webui_authentik_forward_auth: true +garage_webui_authentik_forward_auth_url: "https://auth.gymb.souveredu.ch/outpost.goauthentik.io/auth/traefik" +# Kept for completeness — only used when authentik ForwardAuth is off. 
garage_webui_username: "admin" -garage_webui_password: "{{ _garage.webui_password }}" +garage_webui_password: "{{ _garage.webui_password | default('disabled') }}" garage_rpc_secret: "{{ _garage.rpc_secret }}" garage_admin_token: "{{ _garage.admin_token }}" diff --git a/inventories/demo-gymburgdorf/host_vars/storage/traefik.yml b/inventories/demo-gymburgdorf/host_vars/storage/traefik.yml index 6cd16f5..e704bc3 100644 --- a/inventories/demo-gymburgdorf/host_vars/storage/traefik.yml +++ b/inventories/demo-gymburgdorf/host_vars/storage/traefik.yml @@ -1,4 +1,12 @@ --- +# Local traefik needs to reach authentik for the ForwardAuth subrequest +# the garage-webui router fires. The public IP is unreachable from this +# subnet (no DMZ hairpin), so point auth.gymb.* directly at the +# application host where authentik runs. Without this the forwardauth +# middleware would time out and every garage-console request would 502. +traefik_extra_hosts: + - "auth.gymb.souveredu.ch:172.16.19.101" + # Services hosted on `storage` that the DMZ reverseproxy should forward # public traffic to. See application/traefik.yml for the mechanism. traefik_dmz_exposed_services: diff --git a/inventories/demo-gymburgdorf/hosts.yml b/inventories/demo-gymburgdorf/hosts.yml index 220784b..66261bc 100644 --- a/inventories/demo-gymburgdorf/hosts.yml +++ b/inventories/demo-gymburgdorf/hosts.yml @@ -45,5 +45,21 @@ all: application: authentik_outpost_ldap_servers: + hosts: + application: + + send_servers: + hosts: + application: + + opnform_servers: + hosts: + application: + + homarr_servers: + hosts: + application: + + bookstack_servers: hosts: application: \ No newline at end of file diff --git a/playbooks/site.yml b/playbooks/site.yml index 13b0a7c..2e70255 100644 --- a/playbooks/site.yml +++ b/playbooks/site.yml @@ -17,6 +17,17 @@ roles: - digitalboard.core.traefik +# Inventories without the _dmz/_backend split (e.g. 
demo-gymburgdorf, +# where traefik_servers groups all_servers and dmz/backend is selected +# per host via traefik_mode). The :!… exclusions keep this a no-op +# for the vagrant topology, where every traefik_servers host is already +# covered by the two plays above. +- name: Configure reverse proxies + hosts: traefik_servers:!traefik_servers_dmz:!traefik_servers_backend + become: yes + roles: + - digitalboard.core.traefik + - name: Deploy httpbin service hosts: httpbin_servers become: yes @@ -71,23 +82,17 @@ roles: - digitalboard.core.drawio -# - name: Deploy send service -# hosts: send_servers -# become: yes -# roles: -# - digitalboard.core.send - -# - name: Deploy openforms service -# hosts: openforms_servers -# become: yes -# roles: -# - digitalboard.core.openforms - -- name: Deploy opencloud service - hosts: opencloud_servers +- name: Deploy send service + hosts: send_servers become: yes roles: - - digitalboard.core.opencloud + - digitalboard.core.send + +- name: Deploy opnform service + hosts: opnform_servers + become: yes + roles: + - digitalboard.core.opnform - name: Deploy homarr service hosts: homarr_servers @@ -95,8 +100,14 @@ roles: - digitalboard.core.homarr -- name: Deploy opnform service - hosts: opnform_servers +- name: Deploy bookstack service + hosts: bookstack_servers become: yes roles: - - digitalboard.core.opnform + - digitalboard.core.bookstack + +- name: Deploy opencloud service + hosts: opencloud_servers + become: yes + roles: + - digitalboard.core.opencloud diff --git a/scripts/bao-seed.sh b/scripts/bao-seed.sh new file mode 100755 index 0000000..80945b7 --- /dev/null +++ b/scripts/bao-seed.sh @@ -0,0 +1,191 @@ +#!/usr/bin/env bash +# +# Seed OpenBao secrets for a demo inventory. Merge semantics: existing +# keys at a given path are kept; only missing keys are generated. OIDC +# client secrets are synced between /data/authentik and the +# per-service secret so authentik and the service agree on the same +# value. 
+# +# Usage: +# scripts/bao-seed.sh demo-gymburgdorf +# +# Requirements: +# - bao CLI in $PATH, authenticated (make bao) +# - jq in $PATH +# - openssl in $PATH +# +# Environment overrides: +# BAO_ADDR default https://bao.digitalboard.ch +# DRY_RUN=1 print what would change without writing + +set -eu + +MOUNT="${1:-}" +if [[ -z "$MOUNT" ]]; then + echo "usage: $0 (e.g. demo-gymburgdorf)" >&2 + exit 1 +fi + +: "${BAO_ADDR:=https://bao.digitalboard.ch}" +export BAO_ADDR + +for cmd in bao jq openssl mktemp; do + command -v "$cmd" >/dev/null || { echo "missing: $cmd" >&2; exit 1; } +done + +if ! bao token lookup >/dev/null 2>&1; then + echo "not authenticated — run 'make bao' first" >&2 + exit 1 +fi + +DRY_RUN="${DRY_RUN:-0}" + +WORKDIR="$(mktemp -d)" +trap 'rm -rf "$WORKDIR"' EXIT + +# Read current data for a KV-v2 secret into $WORKDIR/.json. +# Writes {} if the secret does not exist or has no data yet. +read_secret() { + local path="$1" + local out="$WORKDIR/$path.json" + local raw="$WORKDIR/$path.raw" + + if bao kv get -format=json "$MOUNT/$path" >"$raw" 2>/dev/null; then + if jq -e -c '.data.data // {}' "$raw" >"$out" 2>/dev/null; then + return 0 + fi + fi + echo '{}' >"$out" +} + +# Write the in-memory secret back to bao using `bao kv put @file` so +# values containing `=` (e.g. base64 padding) survive intact. +write_secret() { + local path="$1" + local src="$WORKDIR/$path.json" + + if [[ "$DRY_RUN" == "1" ]]; then + echo " [dry-run] would write $MOUNT/$path" + return + fi + bao kv put "$MOUNT/$path" "@$src" >/dev/null +} + +# Print the current value of a key from a secret file, empty string if +# absent. +get_key() { + local file="$1" + local key="$2" + jq -r --arg k "$key" '.[$k] // ""' "$file" +} + +# Set a key inside the secret file to the given literal value. 
+set_key() { + local file="$1" + local key="$2" + local value="$3" + local tmp="$file.tmp" + jq --arg k "$key" --arg v "$value" '.[$k] = $v' "$file" >"$tmp" + mv "$tmp" "$file" +} + +# Generate a key in the secret file if missing. +ensure_key() { + local path="$1" + local key="$2" + local generator="$3" + local file="$WORKDIR/$path.json" + + if [[ -n "$(get_key "$file" "$key")" ]]; then + return + fi + local value + value="$($generator)" + set_key "$file" "$key" "$value" + echo " + $key" +} + +# Force dst_key at dst_path to match the value of src_key at src_path. +# Prints a line only when the value changes. +sync_key_from() { + local src_path="$1" + local src_key="$2" + local dst_path="$3" + local dst_key="$4" + + local src_file="$WORKDIR/$src_path.json" + local dst_file="$WORKDIR/$dst_path.json" + + local src_val + src_val="$(get_key "$src_file" "$src_key")" + if [[ -z "$src_val" ]]; then + echo " ! $src_path/$src_key missing — cannot sync to $dst_path/$dst_key" >&2 + return + fi + + local dst_val + dst_val="$(get_key "$dst_file" "$dst_key")" + if [[ "$src_val" == "$dst_val" ]]; then + return + fi + set_key "$dst_file" "$dst_key" "$src_val" + echo " = $dst_key (synced from $src_path/$src_key)" +} + +gen_hex32() { openssl rand -hex 32; } +gen_hex64() { openssl rand -hex 64; } +gen_pass() { openssl rand -base64 24 | tr -d '/+='; } +gen_long_pass() { openssl rand -base64 32 | tr -d '/+='; } +gen_app_key() { echo "base64:$(openssl rand -base64 32)"; } + +# OpnForm requires: min 8 chars, a letter, a digit, AND one of the +# special chars @$!%*#?&-_+=.,:;<>^()[]{}|~. After tr strips /+=, the +# base64 output contains only [A-Za-z0-9], so we append the fixed suffix +# "!1Aa" to guarantee the rule passes regardless of the random part. 
+gen_opnform_pass() { echo "$(openssl rand -base64 18 | tr -d '/+=')!1Aa"; } + +# ---------------------------------------------------------------- main + +echo "==> seeding $MOUNT (dry-run=$DRY_RUN)" + +echo "-> authentik" +read_secret authentik +ensure_key authentik secret_key gen_hex64 +ensure_key authentik postgres_password gen_pass +ensure_key authentik admin_password gen_pass +ensure_key authentik ldap_outpost_token gen_hex32 +ensure_key authentik nextcloud_oidc_secret gen_hex32 +ensure_key authentik opnform_oidc_secret gen_hex32 +ensure_key authentik homarr_oidc_secret gen_hex32 +ensure_key authentik bookstack_oidc_secret gen_hex32 +write_secret authentik + +echo "-> opnform" +read_secret opnform +ensure_key opnform app_key gen_app_key +ensure_key opnform jwt_secret gen_hex32 +ensure_key opnform front_api_secret gen_hex32 +ensure_key opnform db_password gen_long_pass +ensure_key opnform admin_password gen_opnform_pass +ensure_key opnform oidc_client_secret gen_hex32 +sync_key_from authentik opnform_oidc_secret opnform oidc_client_secret +write_secret opnform + +echo "-> homarr" +read_secret homarr +ensure_key homarr secret_encryption_key gen_hex32 +ensure_key homarr admin_password gen_pass +ensure_key homarr oidc_client_secret gen_hex32 +sync_key_from authentik homarr_oidc_secret homarr oidc_client_secret +write_secret homarr + +echo "-> bookstack" +read_secret bookstack +ensure_key bookstack db_root_password gen_long_pass +ensure_key bookstack db_password gen_long_pass +ensure_key bookstack admin_password gen_pass +ensure_key bookstack oidc_client_secret gen_hex32 +sync_key_from authentik bookstack_oidc_secret bookstack oidc_client_secret +write_secret bookstack + +echo "==> done"