docs(reference-ansible): add docs/ tree and document repo, playbooks, Makefile

Addresses the WKS PoC review (Notion 2026-05-26). All docs in English.
- README: purpose, docs table of contents, annotated repo tree
- docs/getting_started.md: prerequisites (WKS account, OIDC, SSH, VPN) + first deploy
- docs/ansible.md: playbook table, "Running Ansible", service parameters, cheatsheet
- docs/secrets.md: canonical Bao login (moved out of README) + demo defaults
- docs/operations.md: full Makefile reference
- docs/inventories.md: repo layout, topology, standard folder structure, walkthrough
- docs/testing.md: static checks, inventory resolution, smoke test / dry run
- remove ARCHITECTURE.md (architecture docs live externally)

Also includes the gymburgdorf inventory build-out (bookstack, homarr,
opnform, send) and scripts/bao-seed.sh. site.yml keeps a third traefik
play (traefik_servers minus the vagrant _dmz/_backend split) so the demo
inventories still configure their reverse proxy after the rebase onto main.
Simon Bärlocher 2026-05-27 18:08:52 +02:00
parent c67e9aac43
commit 2ba0c07cd3
GPG key ID: 63DE20495932047A
24 changed files with 1541 additions and 525 deletions

docs/operations.md Normal file

@@ -0,0 +1,136 @@
<!-- markdownlint-disable MD013 -->
# Setup & operations
[← Documentation index](README.md)
## Prerequisites (control node)
- `ansible` (Core ≥ 2.15)
- `bao` CLI ([OpenBao](https://openbao.org/)) — e.g.
`sudo pacman -S openbao python-hvac` (Arch) or via Homebrew
- `python-hvac` (for `community.hashi_vault` lookups)
- On macOS: `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES` (set in the
[Makefile](../Makefile); without this env var, Ansible forks crash
on the first `community.hashi_vault` lookup, because the
Objective-C runtime is not fork-safe)
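The prerequisites above can be checked before the first deploy. The following is a minimal sketch, not part of the repo; the function name `missing_tools` is hypothetical.

```bash
# Print every tool from the argument list that is not found in $PATH,
# one per line. Prints nothing when all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call with the CLI tools this page names:
#   missing_tools ansible bao jq openssl
```

The `hvac` Python module is not a CLI tool, so it needs a separate check, for example `python3 -c 'import hvac'`.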
## Initial setup
```bash
git clone <repo>
cd reference-ansible
make install # Galaxy + digitalboard.core into ./collections/
```
`make install` installs `community.hashi_vault` and the
`digitalboard.core` collection (Git, see [requirements.yml](../requirements.yml))
into `./collections/`. There is **no** `roles/` directory in the
repo root — all roles come from the collection, see
[inventories.md § Repo layout](inventories.md#repo-layout-and-role-origin).
## Secrets (OpenBao)
Before every deploy, authenticate to OpenBao in the **same shell**. The
full login flow, the `make bao` caveat, the lookup pattern, and
tenant isolation are documented in **[secrets.md](secrets.md)**.
## Smoke test
```bash
make ping_demo # pings all three demo inventories (ping module)
```
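The Makefile reference on this page describes `make ping_demo` as running `ansible … -m ping` against each demo inventory without aborting on individual failures. A minimal sketch of such a loop follows, assuming the host pattern `all`; the function name `ping_inventories` and the `ANSIBLE` override variable are hypothetical and not taken from the Makefile.

```bash
# Ping every host in each inventory passed as an argument.
# A failure for one inventory is reported on stderr but does not stop
# the loop, similar in spirit to the `|| true` in `make ping_demo`.
ping_inventories() {
  local inv
  for inv in "$@"; do
    "${ANSIBLE:-ansible}" all -i "$inv" -m ping \
      || echo "ping failed for $inv" >&2
  done
}
```

Setting `ANSIBLE=echo` prints the commands instead of running them, which is a cheap way to preview what would be executed.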
## Deploy
```bash
make deploy_site_demo_gymburgdorf # single demo site
make deploy_site_demo_mbazürich
make deploy_site_demo_phbern
make deploy_site_demo # all three in sequence
```
Only the Gymburgdorf target passes `--diff`, so per-task changes are shown for that site only.
For the play order and the plays that run as no-ops,
see [ansible.md § Playbooks](ansible.md#playbook-siteyml).
## Makefile reference
The [Makefile](../Makefile) bundles setup, secret handling, and deploy.
The only variable it accepts on the command line is `DRY_RUN` (for the
`seed_bao_*` targets); everything else is controlled by the choice of target.
### Exported env vars (apply to all targets)
| Variable | Value | Purpose |
| --- | --- | --- |
| `BAO_ADDR` | `https://bao.digitalboard.ch` | OpenBao endpoint for `bao` and `community.hashi_vault` calls |
| `OBJC_DISABLE_INITIALIZE_FORK_SAFETY` | `YES` | macOS fork safety: without this var, Ansible forks crash on the first `hashi_vault` lookup, because the Objective-C runtime is not fork-safe |
> Both are set via `export` at the top of the Makefile and thus
> inherited by every target shell process — regardless of which target runs.
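
These exports only apply when a command runs through `make`. When `ansible-playbook` or `bao` is invoked directly, the same environment has to be set in the shell. A minimal sketch using the values from the table above; the `uname` guard is an assumption for tidiness, since the Makefile sets the variable unconditionally.

```bash
# OpenBao endpoint, value taken from the table above.
export BAO_ADDR=https://bao.digitalboard.ch

# The fork-safety switch only matters on macOS (assumed guard, see lead-in).
if [ "$(uname -s)" = "Darwin" ]; then
  export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
fi
```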
### Setup & secrets
| Target | Effect |
| --- | --- |
| `make install` | `ansible-galaxy collection install -r requirements.yml -p collections` — installs `community.hashi_vault` + `digitalboard.core` into `./collections/` |
| `make bao` | `bao login -method=oidc -path=Digitalboard role=default` + sets `VAULT_TOKEN` via `$(eval …)`. ⚠️ The token only lives **within this single `make` invocation** — see caveat below |
| `make seed_bao_gymburgdorf` | Seed/merge OpenBao secrets for `demo-gymburgdorf` via [scripts/bao-seed.sh](../scripts/bao-seed.sh). Idempotent: existing keys remain, only missing ones are generated |
| `make seed_bao_mbazürich` | same for `demo-mbazürich` |
| `make seed_bao_phbern` | same for `demo-phbern` |
> The `seed_bao_*` targets accept `DRY_RUN=1`, which shows the diff without
> writing anything: `make seed_bao_gymburgdorf DRY_RUN=1`. Requirements:
> `bao`, `jq`, and `openssl` in `$PATH`, plus a valid `VAULT_TOKEN`.
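
The merge rule described for the seed targets (existing keys remain, only missing ones are added) can be illustrated with `jq`. This is a sketch of the idea only and does not reproduce `scripts/bao-seed.sh`; the function name `merge_missing` and the JSON shapes are assumptions.

```bash
# Merge two JSON objects so that keys already present in "existing" win
# and only keys missing there are taken from "generated".
# In jq, the right-hand operand of `+` overrides the left-hand one.
merge_missing() {
  local existing="$1" generated="$2"
  jq -n -c -S \
    --argjson existing "$existing" \
    --argjson generated "$generated" \
    '$generated + $existing'
}
```

Running the merge a second time with its own output as `existing` returns the same object, which is what makes the operation idempotent.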
### Smoke test & deploy
| Target | Effect |
| --- | --- |
| `make ping_demo` | `ansible … -m ping` against all three demo inventories in sequence; failures of individual hosts do not abort (`\|\| true`) |
| `make deploy_site_demo_gymburgdorf` | `ansible-playbook playbooks/site.yml -i …/demo-gymburgdorf/hosts.yml --diff` |
| `make deploy_site_demo_mbazürich` | same for `demo-mbazürich`, **without** `--diff` |
| `make deploy_site_demo_phbern` | same for `demo-phbern`, **without** `--diff` |
| `make deploy_site_demo` | calls the three `deploy_site_demo_*` targets in sequence |
> **Inconsistency:** Only the Gymburgdorf target sets `--diff`. For
> `mbazürich` and `phbern` you do not see the task changes — if
> needed, invoke directly with `ansible-playbook … --diff`, see
> [ansible.md § Running Ansible](ansible.md#running-ansible).
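
A direct invocation with `--diff` can be wrapped so that it works for any inventory. A minimal sketch, assuming it is run from the repo root; the function name `deploy_with_diff` and the `ANSIBLE_PLAYBOOK` override variable are hypothetical, and the inventory path is passed in by the caller because the Makefile's path prefix is not spelled out here.

```bash
# Run playbooks/site.yml against one inventory with --diff enabled.
# ANSIBLE_PLAYBOOK is a hypothetical override hook: set it to `echo`
# to preview the arguments without touching any host.
deploy_with_diff() {
  local inventory="$1"
  "${ANSIBLE_PLAYBOOK:-ansible-playbook}" playbooks/site.yml -i "$inventory" --diff
}
```

Remember that a direct call bypasses the Makefile, so `BAO_ADDR` and a valid `VAULT_TOKEN` must already be exported in the shell.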
### Token caveat (`make bao`)
`make bao` alone is **not** enough for a deploy. Each `make` invocation
is a separate process, so the `VAULT_TOKEN` set by `make bao` exists only
while that invocation runs and is gone by the time you start the next `make deploy_…`.
Two working approaches:
```bash
# Variant A — log in manually in the active shell (survives multiple make invocations)
export BAO_ADDR=https://bao.digitalboard.ch
bao login -method=oidc -path=Digitalboard
export VAULT_TOKEN=$(bao print token)
make deploy_site_demo_gymburgdorf
# Variant B — chain both as ONE make invocation (token lives for the chain)
make bao deploy_site_demo_gymburgdorf
```
Login details and the secret pattern: [secrets.md](secrets.md#openbao-login).
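To catch a missing token before a deploy starts, a small guard can be run in the shell first. This is a sketch, not part of the repo; the function name `require_token` is hypothetical. It checks from a child process, because only exported variables are visible there, which is also all that a later `make` invocation can see.

```bash
# Succeed only if VAULT_TOKEN is exported and non-empty in this shell.
require_token() {
  # The child `sh` sees exported variables only, like `make` would.
  if ! sh -c '[ -n "${VAULT_TOKEN:-}" ]'; then
    echo "VAULT_TOKEN is not exported; log in first (see Variant A)" >&2
    return 1
  fi
}

# Example: require_token && make deploy_site_demo_gymburgdorf
```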
## Known gaps and trade-offs
- **`opencloud` in `demo-gymburgdorf`:** Play present, but no
`opencloud_servers` group — runs as a no-op. If needed, add a group +
`host_vars`, see [inventories.md](inventories.md#walkthrough-creating-a-new-demo-tenant).
- **`turn` host:** defined in the DMZ, but no STUN/TURN role in
[playbooks/site.yml](../playbooks/site.yml) — provisioned only via `base` +
`traefik`.
- **Idempotency:** Roles are Docker-Compose-based; re-runs can
trigger container restarts when Compose inputs change. No
rollback mechanism — on failure, roll back manually.
- **TLS renewal:** handled internally by Traefik via ACME, no external
renew cron in the repo.
- **CI/testing:** currently not in the repo; smoke test via `make ping_demo`.
- **Logging:** `traefik_log_level: DEBUG` is set in `demo-gymburgdorf` and
`vagrant` (role default `INFO`); lower it before adapting these inventories for production.