docs: add architecture section and overhaul top-level README

- Move Simon's architecture documentation into architecture/ (setup, variables, topology, dns, deploy, security, operations, plus index and glossary). All cross-repo references point at https://git.digitalboard.ch/Digitalboard/{reference-ansible,dns-zones} via absolute URLs so the docs remain navigable from any context.
- Rewrite README.md as a documentation hub: introduction, platform Mermaid overview, comparison of the three repos (docs / digitalboard.core / reference-ansible) and a full table of contents covering architecture, contributing, infrastructure, keycloak, ms-entra and troubleshooting.

Addresses the open items from the WKS PoC review (2026-05-26): a welcome text, an overview diagram and links to the two other repos in the docs README, as well as moving the architecture documentation.
parent 8c2ea8cc72
commit 345cf4b319
9 changed files with 742 additions and 27 deletions
99
architecture/operations.md
Normal file
@@ -0,0 +1,99 @@
<!-- markdownlint-disable MD013 MD060 MD051 -->
# Operations — new tenants and known gaps

← Back to [Architecture index](README.md)

## 10. Walkthrough: creating a new demo tenant

Recommended template: **`demo-gymburgdorf`** (not `vagrant`, since its
group topology is incompatible).

1. **Copy the inventory:**

   ```bash
   cp -r inventories/demo-gymburgdorf inventories/demo-<customer>
   ```

2. **Adjust `hosts.yml`:** set the IP address and hostname of each host.

3. **`group_vars/all/vault.yml`:** point `vault_mount` at the new
   tenant mount (`demo-<customer>`).

4. **`group_vars/traefik_servers/traefik.yml`:** repoint
   `traefik_acme_dns_zone` and the `traefik_acme_tsig_*` lookup paths
   at the new zone and the new Bao path.

5. **`host_vars/application/*.yml`** and
   **`host_vars/storage/*.yml`:** go through each file and change the
   FQDNs to the new domain pattern (e.g. `*.<customer>.souveredu.ch`)
   and the Bao lookup paths to `demo-<customer>/data/…`.

6. **Prepare OpenBao** (out-of-band, not via Ansible):
   - Create a new KV-v2 mount `demo-<customer>`.
   - Write secrets: `acme-tsig`, `authentik`, `nextcloud`, `garage`, …
     (see [security.md](security.md) for the mandatory-override list).
   - Give the deploy token a policy with read access on
     `demo-<customer>/data/*`.

7. **DNS** (in the [`dns-zones`](https://git.digitalboard.ch/Digitalboard/dns-zones) repo, see
   [dns.md](dns.md)):
   - Add `key:` and `acl:` entries for the new tenant in
     [`knot/knot.conf`](https://git.digitalboard.ch/Digitalboard/dns-zones/src/branch/main/knot/knot.conf), following the pattern
     `acme_update_key_demo_<customer>` /
     `acme_updates_demo_<customer>`, scoped to
     `demo-<customer>._acme.digitalboard.ch.`.
   - Append the new ACL to the `_acme.digitalboard.ch` zone's `acl:`
     list. The tenants share the parent zone; there is no NS delegation.
   - In `zones/souveredu.ch.zone` (or the tenant's public zone) add
     the public/internal A records (`rvp.<customer>`,
     `reverseproxy.int.<customer>`, `application.int.<customer>`,
     `storage.int.<customer>`, …), the service CNAMEs to
     `rvp.<customer>`, and the `_acme-challenge.*` CNAMEs into
     `demo-<customer>._acme.digitalboard.ch`. Bump the SOA serial.
   - Run `make deploy_ns1` to push.

8. **Makefile:** add a new target modelled on
   `deploy_site_demo_gymburgdorf` and wire it into
   `deploy_site_demo`.

9. **Smoke test:**
   `ansible all -i inventories/demo-<customer>/hosts.yml -m ping`.

10. **Deploy:** log in to Bao, then run `make deploy_site_demo_<customer>`.
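
Steps 1 to 5 amount to a copy followed by a series of manual edits, and it is easy to miss a leftover reference to the template tenant. The following is a minimal sketch of a helper that performs the copy and then lists every line still mentioning `gymburgdorf`. The function name `clone_inventory` and its optional second argument (the inventory root) are hypothetical and not part of the repo.

```bash
# Sketch only: clone the template inventory and report leftover references.
# clone_inventory and its messages are illustrative, not taken from the repo.
clone_inventory() {
  local customer="$1"
  local root="${2:-inventories}"   # inventory root, defaults to ./inventories
  local src="$root/demo-gymburgdorf"
  local dst="$root/demo-$customer"

  [ -d "$src" ] || { echo "template not found: $src" >&2; return 1; }
  [ ! -e "$dst" ] || { echo "target already exists: $dst" >&2; return 1; }

  cp -r "$src" "$dst"

  # Each hit is a value that steps 2-5 still have to change by hand.
  echo "References to the template tenant left in $dst:"
  grep -rnF 'gymburgdorf' "$dst" || echo "(none)"
}
```

Calling `clone_inventory <customer>` from the repository root prints the remaining hits; the list shrinking to `(none)` is a rough signal that the FQDNs, `vault_mount` and Bao lookup paths have all been repointed.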
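
For step 6, the deploy-token policy can be written down as a small HCL document. The sketch below only prints that policy; the function name `print_deploy_policy` and the policy name in the comments are hypothetical, and the commented commands assume the `bao` CLI accepts the same syntax as the Vault CLI it was forked from.

```bash
# Sketch only: print a read-only policy for one tenant's KV-v2 mount.
# KV-v2 serves secret data under <mount>/data/, hence the path below.
print_deploy_policy() {
  local customer="$1"
  cat <<EOF
# Read-only access to the secrets of demo-${customer}
path "demo-${customer}/data/*" {
  capabilities = ["read"]
}
EOF
}

# Possible use against a running OpenBao (not executed here; the policy name
# is an assumption):
#   bao secrets enable -path="demo-<customer>" -version=2 kv
#   print_deploy_policy <customer> | bao policy write "deploy-demo-<customer>" -
```

Keeping the policy generation separate from the `bao` calls lets the output be reviewed before anything is written to the secret store.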
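
The `key:` and `acl:` entries from step 7 follow a fixed naming pattern, so they can be generated rather than typed. The sketch below prints the two list items for one tenant. The function name, the `hmac-sha256` algorithm, and the `update-type` and `update-owner` values are assumptions; compare them with the existing entries in `knot/knot.conf` before using the output.

```bash
# Sketch only: print the knot.conf list items for one tenant's ACME updates.
# Algorithm, update-type and update-owner are assumptions to be checked
# against the existing tenants in knot/knot.conf.
print_knot_acme_acl() {
  local customer="$1"
  local secret="${2:-REPLACE_WITH_BASE64_SECRET}"
  cat <<EOF
# add under the existing "key:" section
  - id: acme_update_key_demo_${customer}
    algorithm: hmac-sha256
    secret: ${secret}

# add under the existing "acl:" section
  - id: acme_updates_demo_${customer}
    key: acme_update_key_demo_${customer}
    action: update
    update-type: TXT
    update-owner: name
    update-owner-name: demo-${customer}._acme.digitalboard.ch.
EOF
}
```

The new ACL id still has to be appended by hand to the `acl:` list of the `_acme.digitalboard.ch` zone, as described in the step above.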
## 11. Known gaps and trade-offs

- **Optional services without group bindings in `demo-gymburgdorf`:**
  `opencloud`, `send`, `opnform`, `homarr`, and `bookstack` are
  declared as plays in
  [playbooks/site.yml](https://git.digitalboard.ch/Digitalboard/reference-ansible/src/branch/main/playbooks/site.yml) but have no
  `<service>_servers` group in the inventory, so those plays run as
  no-ops. If needed, add the group and `host_vars/application/<svc>.yml`
  as described in [topology.md](topology.md). Mind the spelling:
  `opnform_servers` (not `openform`/`openforms`).
- **`turn` host:** defined in the DMZ, but there is no STUN/TURN role in
  [playbooks/site.yml](https://git.digitalboard.ch/Digitalboard/reference-ansible/src/branch/main/playbooks/site.yml). It is currently provisioned only
  via `base` + `traefik`.
- **Idempotency:** roles are Docker-Compose-based; re-runs may trigger
  container restarts when compose inputs change. There is no dedicated
  rollback mechanism; on failure, roll back manually to the previous
  state.
- **TLS renewal:** handled internally by Traefik via ACME. There is no
  external renewal cron in the repo.
- **CI / testing:** not present in the repo. The smoke test is
  `make ping_demo`.
- **Logs:** Traefik runs with `traefik_log_level: DEBUG` in
  `demo-gymburgdorf` and `vagrant` (role default is `INFO`). Reduce it
  to `INFO` or `WARN` before adapting for production.
- **TSIG secrets in `knot.conf`:** the `dns-zones` repo currently
  stores all four ACME TSIG keys in plaintext in
  [`knot/knot.conf`](https://git.digitalboard.ch/Digitalboard/dns-zones/src/branch/main/knot/knot.conf). The Ansible
  side reads them from Bao, but the Knot side does not, so anyone with
  read access to the `dns-zones` repo can write TXT records under the
  matching tenant's ACME sub-tree. For prod, source the Knot keys
  from a templated config + secret store, or restrict repo access.
- **Demo tenants share `_acme.digitalboard.ch`:** isolation is by
  Knot ACL `update-owner-name`, not by zone delegation. A mis-edit
  of the ACL list could break ACL-based isolation without breaking
  DNS resolution, so the failure would be silent. The production zone
  (`digitalboard.ch`) uses a properly delegated child zone and is
  not affected.
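
Because a mis-edited ACL list fails silently, a small lint run before `make deploy_ns1` may help. The sketch below covers only one class of mistake: an `acme_updates_demo_*` id that appears on a single line of `knot.conf`, meaning it is either defined but never listed in a zone, or listed but never defined. It does not validate the `update-owner-name` scoping itself. The function name is hypothetical, and the pattern assumes ACL ids contain only letters, digits and underscores.

```bash
# Sketch only: flag ACME ACL ids that occur on just one line of knot.conf.
# A healthy id is expected on at least two lines: its definition and the
# acl list of the zone that uses it.
check_acme_acl_refs() {
  local conf="$1" id count status=0
  for id in $(grep -oE 'acme_updates_demo_[A-Za-z0-9_]+' "$conf" | sort -u); do
    # -w matches whole words only, so demo_a does not match demo_ab
    count="$(grep -cwF -- "$id" "$conf")"
    if [ "$count" -lt 2 ]; then
      echo "$id appears on only one line of $conf" >&2
      status=1
    fi
  done
  return "$status"
}
```

A non-zero exit status from `check_acme_acl_refs knot/knot.conf` would then be a reason to re-read the ACL section before pushing.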
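
The `DEBUG` log level noted above is copied along with the template inventory, so a new tenant inherits it. A minimal sketch of a check that lists the inventory files still setting it follows; the function name is hypothetical, and the pattern assumes the variable is set on a single line.

```bash
# Sketch only: list inventory files that still set Traefik to DEBUG.
# Exit status is 0 when at least one match is found, 1 when there is none.
find_debug_log_level() {
  local root="${1:-inventories}"
  # Anchoring on leading whitespace skips commented-out lines.
  grep -rnE '^[[:space:]]*traefik_log_level:.*DEBUG' "$root"
}
```

Running `find_debug_log_level inventories/demo-<customer>` before a production rollout shows where the level still needs to be lowered.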