lomavuokraus/docs/infra.html
Tero Halla-aho 0bb709d9c5
chore: fix audit alerts and formatting
2026-02-04 12:43:03 +02:00


<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>Infrastructure</title>
<link rel="stylesheet" href="./style.css" />
</head>
<body>
<header>
<h1>Infrastructure Overview</h1>
<div class="meta">
Hetzner k3s cluster, Traefik ingress, cert-manager TLS, private
registry, staging/prod namespaces.
</div>
</header>
<main class="grid">
<section class="card">
<h2>Traffic flow</h2>
<div class="diagram">
<pre class="mermaid">
flowchart LR
DNS["lomavuokraus.fi\nstaging.lomavuokraus.fi\napi.lomavuokraus.fi"] --> Traefik["Traefik ingress\n(class: traefik)"]
User["User browser"] -->|"HTTPS"| Traefik
CertMgr["cert-manager\nletsencrypt prod/staging"] -->|"TLS"| Traefik
subgraph Cluster["k3s hel1 cx23 (157.180.66.64)"]
Traefik --> Service["Service :80 -> 8080"]
Service --> Varnish["Varnish cache\n(static + /api/images/*)"]
Varnish --> Pod["Next.js pods (2)\n(port 3000)"]
Pod --> DB["PostgreSQL 46.62.203.202"]
Pod --> SMTP["smtp.lomavuokraus.fi"]
Secret["Secret: lomavuokraus-web-secrets"]
CM["ConfigMap: lomavuokraus-web-config"]
end
Registry["registry.halla-aho.net/thalla/lomavuokraus-web"] -->|"pull"| Pod
</pre>
</div>
<div class="callout">
Mermaid renders the diagrams directly in the browser; edit the graph
source in this file to update them.
</div>
</section>
<section class="card">
<h2>Hetzner nodes</h2>
<div class="diagram">
<pre class="mermaid">
flowchart TB
Users["Users"] -->|"HTTPS"| K3s["Node A: k3s (hel1 cx23)\nTraefik + cert-manager"]
subgraph HetznerCloud["Hetzner Cloud"]
K3s
DB["Node B: Postgres VM\n46.62.203.202"]
end
subgraph Prod["Prod namespace"]
Prod1["Next.js pod #1 (prod)"]
Prod2["Next.js pod #2 (prod)"]
end
subgraph Staging["Staging namespace"]
Stg1["Next.js pod #1 (staging)"]
Stg2["Next.js pod #2 (staging)"]
end
K3s --> Prod1
K3s --> Prod2
K3s --> Stg1
K3s --> Stg2
Prod1 --> DB
Prod2 --> DB
Stg1 --> DB
Stg2 --> DB
</pre>
</div>
</section>
<section class="card">
<h2>Cluster &amp; Namespaces</h2>
<ul>
<li>
Single-node k3s (Hetzner hel1 cx23) at <code>157.180.66.64</code>.
</li>
<li>
Namespaces: <code>lomavuokraus-prod</code>,
<code>lomavuokraus-staging</code>, <code>lomavuokraus-test</code>.
</li>
<li>Ingress controller: Traefik (k3s default).</li>
<li>
cert-manager v1.15.3 with ClusterIssuers:
<ul>
<li><code>letsencrypt-prod</code> (ACME prod)</li>
<li>
<code>letsencrypt-staging</code> (ACME staging for test certs)
</li>
</ul>
</li>
<li>
The Service targets a Varnish sidecar (port 8080) in each pod. Varnish
sits in front of the Next.js container (port 3000) and caches
<code>/api/images/*</code> and static assets.
</li>
<li>
Cache policy: images cached 24h with
<code>Cache-Control: public, max-age=86400, immutable</code>;
<code>_next/static</code> cached 7d; non-GET traffic and health
checks bypass cache.
</li>
<li>
DNS: <code>lomavuokraus.fi</code>,
<code>staging.lomavuokraus.fi</code>, and
<code>api.lomavuokraus.fi</code> all resolve to the cluster's public IP.
</li>
</ul>
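<div class="callout">
The cache policy listed above can be read as a small decision function.
The sketch below is illustrative only: the real rules live in the Varnish
configuration, and the names <code>cacheDecision</code> and
<code>CacheDecision</code> are hypothetical. Treating every other path as
uncached is an assumption, not something this page states.
</div>
<pre><code class="language-typescript">
```typescript
// Sketch of the documented cache policy. Names are hypothetical; the
// authoritative rules are in the Varnish configuration, not in this code.
type CacheDecision =
  | { action: "bypass" }
  | { action: "cache"; ttlSeconds: number };

const DAY = 86400; // matches max-age=86400 in the image Cache-Control header

function cacheDecision(method: string, path: string): CacheDecision {
  // Non-GET traffic and health checks bypass the cache.
  if (method !== "GET" || path === "/api/health") {
    return { action: "bypass" };
  }
  // Images are cached for 24 hours.
  if (path.startsWith("/api/images/")) {
    return { action: "cache", ttlSeconds: DAY };
  }
  // Next.js static assets are cached for 7 days.
  if (path.startsWith("/_next/static/")) {
    return { action: "cache", ttlSeconds: 7 * DAY };
  }
  // Assumption: everything else goes straight to Next.js uncached.
  return { action: "bypass" };
}
```
</code></pre>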
</section>
<section class="card">
<h2>Registry</h2>
<ul>
<li>
Private registry:
<code>registry.halla-aho.net/thalla/lomavuokraus-web</code>.
</li>
<li>
Credentials are stored outside the repo (<code>creds/</code>); the image
pull secret <code>registry-halla</code> exists in the staging and prod
namespaces.
</li>
<li>
Images are tagged with a numeric tag derived from the git SHA and with
<code>:latest</code>.
</li>
</ul>
</section>
<section class="card">
<h2>App Manifests</h2>
<ul>
<li>
<code>k8s/app.yaml</code> templated via envsubst in deploy scripts.
</li>
<li>
Objects:
<ul>
<li>
ConfigMap: <code>lomavuokraus-web-config</code> (public env).
</li>
<li>
Deployment: 2 replicas, each with a Varnish sidecar on port 8080 in
front of the Next.js container (port 3000); liveness and readiness
probes hit <code>/api/health</code> through Varnish.
</li>
<li>
Service: ClusterIP on port 80 targeting the Varnish container.
</li>
<li>
Ingress: Traefik class, TLS via cert-manager, HTTPS redirect
middleware.
</li>
<li>
Traefik Middleware: <code>https-redirect</code> to force HTTPS.
</li>
</ul>
</li>
<li>
Secrets: <code>lomavuokraus-web-secrets</code> in cluster (not in
repo).
</li>
</ul>
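<div class="callout">
To illustrate the templating step, the sketch below mimics the
<code>${VAR}</code> form of envsubst substitution in TypeScript. It is not
the deploy script itself: <code>renderTemplate</code> and the variable name
<code>IMAGE_TAG</code> are hypothetical, and unlike envsubst, which
substitutes an empty string for unset variables, this sketch throws so that
a missing value is caught early.
</div>
<pre><code class="language-typescript">
```typescript
// Minimal stand-in for envsubst-style templating (only the ${VAR} form).
type Vars = { [name: string]: string };

function renderTemplate(template: string, vars: Vars): string {
  return template.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => {
      const value = vars[name];
      if (value === undefined) {
        // Stricter than envsubst: fail instead of emitting an empty string.
        throw new Error(`missing template variable: ${name}`);
      }
      return value;
    },
  );
}

// IMAGE_TAG is a hypothetical variable name used only for illustration.
const manifestLine =
  "image: registry.halla-aho.net/thalla/lomavuokraus-web:${IMAGE_TAG}";
console.log(renderTemplate(manifestLine, { IMAGE_TAG: "12345" }));
```
</code></pre>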
</section>
<section class="card">
<h2>Runtime Environment</h2>
<ul>
<li>
Next.js 14.2.33 (App Router) running via Node.js 20 in Docker.
</li>
<li>
PostgreSQL at <code>46.62.203.202</code>. The staging/prod DB
<code>lomavuokraus</code> is clean and contains only the seeded admin;
the testing DB <code>lomavuokraus_testing</code> holds the previous
data. A schema snapshot is tracked in <code>docs/db-schema.sql</code>.
</li>
<li>
SMTP: smtp.lomavuokraus.fi (CNAME to smtp.sohva.org), DKIM key under
<code>creds/dkim/...</code>.
</li>
<li>
Session auth: signed JWT cookie <code>session_token</code>; roles:
USER, ADMIN, USER_MODERATOR, LISTING_MODERATOR.
</li>
</ul>
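<div class="callout">
The four roles listed above can be modelled as a closed set so that a
value read from a session payload is validated before use. This is a
sketch under assumptions: <code>parseRole</code> is a hypothetical helper,
and how the application actually stores the role in the JWT is not
described on this page.
</div>
<pre><code class="language-typescript">
```typescript
// The role names come from this page; everything else is illustrative.
const ROLES = ["USER", "ADMIN", "USER_MODERATOR", "LISTING_MODERATOR"] as const;
type Role = (typeof ROLES)[number];

// Returns the role if the value is one of the known names, otherwise null.
function parseRole(value: unknown): Role | null {
  for (const role of ROLES) {
    if (role === value) {
      return role;
    }
  }
  return null;
}
```
</code></pre>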
</section>
<section class="card">
<h2>Emergency shutdown</h2>
<ul>
<li>
Script: <code>scripts/emergency-shutdown.sh</code> issues Hetzner
poweroff/shutdown commands for tracked nodes (currently
<code>node1.lomavuokraus.fi</code> and
<code>db1.lomavuokraus.fi</code>).
</li>
<li>
List tracked nodes:
<code>./scripts/emergency-shutdown.sh --list</code>.
</li>
<li>
Hard stop everything:
<code
>./scripts/emergency-shutdown.sh --yes --confirm "SHUTDOWN ALL
LOMAVUOKRAUS NODES NOW"</code
>
(default action is <code>poweroff</code>; use
<code>--action shutdown</code> for ACPI).
</li>
<li>
Requires the <code>hcloud</code> CLI, authenticated either through
<code>HCLOUD_TOKEN</code> or a configuration at
<code>~/.config/hcloud/cli.toml</code>; keep the node list updated when
adding servers.
</li>
</ul>
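<div class="callout">
The confirmation guard described above can be sketched as follows. This
does not reproduce <code>scripts/emergency-shutdown.sh</code>; it only
shows the idea of refusing to act without both <code>--yes</code> and the
exact phrase. The function <code>planShutdown</code> is hypothetical, and
the sketch assumes the tracked node names are also the Hetzner server
names accepted by <code>hcloud server poweroff</code> and
<code>hcloud server shutdown</code>.
</div>
<pre><code class="language-typescript">
```typescript
// Phrase and node names are taken from this page; the rest is illustrative.
const CONFIRM_PHRASE = "SHUTDOWN ALL LOMAVUOKRAUS NODES NOW";
const NODES = ["node1.lomavuokraus.fi", "db1.lomavuokraus.fi"];

type Action = "poweroff" | "shutdown";

// Returns one hcloud argument list per node, or throws if the caller has
// not supplied both the --yes flag and the exact confirmation phrase.
function planShutdown(
  yes: boolean,
  confirm: string,
  action: Action = "poweroff",
): string[][] {
  if (!yes || confirm !== CONFIRM_PHRASE) {
    throw new Error("refusing to run: pass --yes and the exact --confirm phrase");
  }
  // Assumption: each tracked node name doubles as its hcloud server name.
  return NODES.map((node) => ["hcloud", "server", action, node]);
}
```
</code></pre>
<div class="callout">
Returning the planned commands instead of executing them keeps the guard
easy to test without touching real servers.
</div>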
</section>
</main>
<script type="module">
import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs";
mermaid.initialize({ startOnLoad: true, theme: "dark" });
</script>
</body>
</html>