docs: add architecture section and overhaul top-level README
- Move Simon's architecture documentation into architecture/ (setup, variables, topology, dns, deploy, security, operations plus index and glossary). All cross-repo references point at https://git.digitalboard.ch/Digitalboard/{reference-ansible,dns-zones} via absolute URLs so the docs remain navigable from any context.
- Rewrite README.md as a documentation hub: introduction, platform Mermaid overview, comparison of the three repos (docs / digitalboard.core / reference-ansible) and a full table of contents covering architecture, contributing, infrastructure, keycloak, ms-entra and troubleshooting.

Addresses the open items from the WKS PoC review (2026-05-26): welcome text for the docs README, an overview diagram, and links to the two other repos, as well as moving the architecture documentation.
parent 8c2ea8cc72
commit 345cf4b319
9 changed files with 742 additions and 27 deletions

architecture/dns.md (normal file, 123 lines)
@@ -0,0 +1,123 @@
<!-- markdownlint-disable MD013 MD060 MD051 -->
# DNS topology and ACME zone layout

← Back to [Architecture index](README.md)

Authoritative DNS for everything described in this document runs on
**`ns1.digitalboard.ch`** (public `193.43.183.169`, DMZ `172.16.9.169`)
using **Knot DNS**. The zone files and Knot config live in the
[`dns-zones`](https://git.digitalboard.ch/Digitalboard/dns-zones) repo; this section explains how the
public service FQDNs, the internal "split-horizon" FQDNs, and the ACME
challenge sub-trees fit together.

## Authoritative zones on `ns1`

| Zone | Purpose | DNSSEC | Dynamic updates |
|---|---|---|---|
| `digitalboard.ch` | Production zone for the platform itself (`auth`, `cloud`, `office`, `bao`, …). | on | none (static zone file) |
| `_acme.digitalboard.ch` | Parent zone for ACME challenge labels. | on | yes, per-tenant TSIG ACLs (`demo-gymb`, `demo-phbe`, `demo-mbaz`) |
| `digitalboard._acme.digitalboard.ch` | **Delegated** child zone for `digitalboard.ch` ACME updates only. | off | yes, TSIG `acme_update_key_digitalboard` |
| `souveredu.ch` | Demo-tenant zone (`gymb`, `phbe`, `mbaz` sub-labels). | on | none (static zone file) |
| `demo-schulen.ch` | Reserve / unused so far. | on | none |

> **Two different ACME models live here.** This is the most common
> source of confusion when copying a tenant:
>
> - `digitalboard.ch` uses an **NS-delegated child zone**
>   (`digitalboard._acme.digitalboard.ch.` has its own `NS` record in
>   `_acme.digitalboard.ch`). The TSIG key writes into that delegated
>   zone.
> - The demo tenants (`demo-gymb`, `demo-phbe`, `demo-mbaz`) **share
>   the parent zone** `_acme.digitalboard.ch` and are isolated only
>   by **Knot ACL `update-owner-name`** on the per-tenant sub-tree
>   (`demo-gymb._acme.digitalboard.ch.` and below). There is no NS
>   delegation for them.
>
> Both work for the ACME flow; the demo model is cheaper to manage but
> means tenant isolation depends on Knot ACLs, not zone boundaries.

## Naming pattern for `demo-gymb` (template for new tenants)

```text
Public, browser-facing:
  cloud.gymb.souveredu.ch          CNAME → rvp.gymb.souveredu.ch   (193.43.183.131)
  auth.gymb.souveredu.ch           CNAME → rvp.gymb.souveredu.ch
  office.gymb.souveredu.ch         CNAME → rvp.gymb.souveredu.ch
  s3.gymb.souveredu.ch             CNAME → rvp.gymb.souveredu.ch
  ...

Internal, server-to-server (split horizon):
  cloud.int.gymb.souveredu.ch      A     → 172.16.19.101   (application host)
  auth.int.gymb.souveredu.ch       A     → 172.16.19.101
  office.int.gymb.souveredu.ch     A     → 172.16.19.101
  s3.int.gymb.souveredu.ch         A     → 172.16.19.102   (storage host)
  ...

Tenant entry IPs:
  rvp.gymb.souveredu.ch            A     → 193.43.183.131  (DMZ Traefik public)
  reverseproxy.int.gymb            A     → 172.16.9.111    (DMZ Traefik internal)

ACME challenge labels (writeable via TSIG acme_update_key_demo_gymb):
  _acme-challenge.cloud.gymb       CNAME → cloud.demo-gymb._acme.digitalboard.ch
  _acme-challenge.cloud.int.gymb   CNAME → cloud.int.demo-gymb._acme.digitalboard.ch
  ...
```

The `.int.` family is what makes Nextcloud → Garage, Nextcloud →
Authentik (OIDC), Nextcloud → Collabora (WOPI) etc. **bypass the DMZ
Traefik**: the backend host's local Traefik presents the right cert
directly, so traffic stays on the backend subnet. Without this,
server-to-server calls would either ride out through the DMZ and back
in, or hit a hostname mismatch on the cert.

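The naming pattern above can be sketched as a small helper for deriving the names of a new tenant. This is only an illustration: the function `tenant_names`, its parameters, and the constants are hypothetical, and it assumes that the short labels in the listing are relative to `souveredu.ch` and that every demo tenant follows the `demo-<label>` prefix seen for `gymb`.

```python
# Hypothetical helper, not part of any repo: derives names by following
# the demo-gymb pattern shown above.
BASE_DOMAIN = "souveredu.ch"            # demo-tenant zone
ACME_PARENT = "_acme.digitalboard.ch"   # shared parent zone for demo tenants


def tenant_names(service: str, label: str, internal: bool = False) -> dict[str, str]:
    """Return the service FQDN, its _acme-challenge owner and the CNAME target.

    service  -- e.g. "cloud", "auth", "office", "s3"
    label    -- tenant sub-label in souveredu.ch, e.g. "gymb"
    internal -- True for the split-horizon ".int." variant
    """
    host = f"{service}.int" if internal else service
    return {
        "fqdn": f"{host}.{label}.{BASE_DOMAIN}",
        "challenge": f"_acme-challenge.{host}.{label}.{BASE_DOMAIN}",
        # Assumption: the ACME sub-tree is always named demo-<label>.
        "target": f"{host}.demo-{label}.{ACME_PARENT}",
    }


if __name__ == "__main__":
    for internal in (False, True):
        for key, value in tenant_names("cloud", "gymb", internal).items():
            print(f"{key:10} {value}")
```

Running it for `cloud` and `gymb` should reproduce the two `_acme-challenge` lines from the listing, which makes it a quick sanity check before adding records for another tenant.
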
## TSIG / ACL model

```mermaid
flowchart LR
    classDef tenant fill:#dcfce7,stroke:#166534,color:#000
    classDef zone fill:#dbeafe,stroke:#1e40af,color:#000
    classDef acl fill:#fef3c7,stroke:#92400e,color:#000

    subgraph KNOT["ns1.digitalboard.ch (Knot DNS)"]
        Z1["_acme.digitalboard.ch<br/>(parent zone)"]:::zone
        Z2["digitalboard._acme.digitalboard.ch<br/>(NS-delegated child)"]:::zone
        A1["ACL acme_updates_digitalboard<br/>scope: digitalboard._acme.digitalboard.ch."]:::acl
        A2["ACL acme_updates_demo_gymb<br/>scope: demo-gymb._acme.digitalboard.ch."]:::acl
        A3["ACL acme_updates_demo_phbe<br/>scope: demo-phbe._acme.digitalboard.ch."]:::acl
        A4["ACL acme_updates_demo_mbaz<br/>scope: demo-mbaz._acme.digitalboard.ch."]:::acl
    end

    DB["digitalboard.ch Traefik<br/>TSIG: acme_update_key_digitalboard"]:::tenant
    GY["demo-gymb Traefik<br/>TSIG: acme_update_key_demo_gymb"]:::tenant
    PH["demo-phbe Traefik<br/>TSIG: acme_update_key_demo_phbe"]:::tenant
    MB["demo-mbaz Traefik<br/>TSIG: acme_update_key_demo_mbaz"]:::tenant

    DB -- nsupdate TXT --> A1
    GY -- nsupdate TXT --> A2
    PH -- nsupdate TXT --> A3
    MB -- nsupdate TXT --> A4
    A1 -- writes into --> Z2
    A2 -- writes into --> Z1
    A3 -- writes into --> Z1
    A4 -- writes into --> Z1
```

Each ACL is restricted to **`update-type: TXT`** and
**`update-owner-match: sub-or-equal`** under the tenant prefix, so a
leaked tenant key cannot write outside its own ACME sub-tree and cannot
modify non-TXT records (no A/CNAME/NS hijack).

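The effect of the two restrictions can be modelled in a few lines. The sketch below is not Knot's implementation; `update_allowed` is a hypothetical function that only illustrates how a TXT-only rule combined with a sub-or-equal owner match would accept or reject an update, assuming case-insensitive names and an optional trailing dot.

```python
def update_allowed(owner: str, rtype: str, scope: str) -> bool:
    """Illustrative model of update-type TXT plus sub-or-equal owner matching.

    owner -- owner name of the record in the dynamic update
    rtype -- record type of the update, e.g. "TXT"
    scope -- the tenant prefix the ACL is bound to
    """
    def norm(name: str) -> str:
        # DNS names compare case-insensitively; ignore a trailing root dot.
        return name.lower().rstrip(".")

    owner_n, scope_n = norm(owner), norm(scope)
    # "sub-or-equal": the owner is the scope itself or a name below it.
    in_scope = owner_n == scope_n or owner_n.endswith("." + scope_n)
    return rtype.upper() == "TXT" and in_scope


if __name__ == "__main__":
    scope = "demo-gymb._acme.digitalboard.ch."
    print(update_allowed("cloud.demo-gymb._acme.digitalboard.ch.", "TXT", scope))
    print(update_allowed("cloud.demo-phbe._acme.digitalboard.ch.", "TXT", scope))
    print(update_allowed("cloud.demo-gymb._acme.digitalboard.ch.", "CNAME", scope))
```

Matching on a leading dot rather than a plain suffix matters here: a name such as `xdemo-gymb._acme.digitalboard.ch` ends with the scope string but is not below it in the DNS tree.
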
## Traefik variables that bind to this layout

From `inventories/demo-gymburgdorf/group_vars/traefik_servers/traefik.yml`:

| Traefik variable | Value for `demo-gymb` | Bound to |
|---|---|---|
| `traefik_acme_dns_provider` | `rfc2136` | Knot dynamic-update endpoint |
| `traefik_acme_dns_zone` | `demo-gymb._acme.digitalboard.ch` | Per-tenant write scope on `ns1` |
| `traefik_acme_tsig_key_name` | `acme_update_key_demo_gymb` | Matches `key:` entry in [`knot.conf`](https://git.digitalboard.ch/Digitalboard/dns-zones/src/branch/main/knot/knot.conf) |
| `traefik_acme_tsig_secret` | Bao lookup | See [security.md](security.md) |

A tenant whose ACME zone does **not** match the Knot ACL
`update-owner-name` gets `REFUSED` on `nsupdate`, and ACME issuance
retries silently until the renewal window expires.

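Because the failure is silent, it can help to compare the Traefik values against the naming convention before deploying. The check below is a sketch under assumptions: `check_acme_binding` is a hypothetical helper, and the expected zone and key names are inferred from the `demo-gymb` and `digitalboard` examples in this document rather than read from `knot.conf`.

```python
def check_acme_binding(tenant: str, dns_zone: str, tsig_key_name: str) -> list[str]:
    """Return a list of mismatches; an empty list means the values line up.

    tenant        -- ACME tenant prefix, e.g. "demo-gymb"
    dns_zone      -- value of traefik_acme_dns_zone
    tsig_key_name -- value of traefik_acme_tsig_key_name
    """
    # Assumed convention, inferred from the examples in this document.
    expected_zone = f"{tenant}._acme.digitalboard.ch"
    expected_key = "acme_update_key_" + tenant.replace("-", "_")

    problems = []
    if dns_zone.rstrip(".").lower() != expected_zone:
        problems.append(
            f"traefik_acme_dns_zone is {dns_zone!r}, expected {expected_zone!r}"
        )
    if tsig_key_name != expected_key:
        problems.append(
            f"traefik_acme_tsig_key_name is {tsig_key_name!r}, expected {expected_key!r}"
        )
    return problems


if __name__ == "__main__":
    issues = check_acme_binding(
        "demo-gymb",
        "demo-gymb._acme.digitalboard.ch",
        "acme_update_key_demo_gymb",
    )
    print(issues or "ACME zone and TSIG key name follow the convention")
```

A check like this only catches naming drift on the Traefik side; whether the ACL on `ns1` actually carries the matching scope still has to be verified against the Knot configuration.
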