From 485e3f165b6e995dbbc7f25ac351de44c1f3793d Mon Sep 17 00:00:00 2001
From: serversdown
Date: Mon, 15 Jun 2026 18:27:40 +0000
Subject: [PATCH] docs: portal-auth design spec (Phase 1 password gate; operator-auth + multi-tenant deferred)

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../specs/2026-06-15-portal-auth-design.md | 237 ++++++++++++++++++
 1 file changed, 237 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-15-portal-auth-design.md

diff --git a/docs/superpowers/specs/2026-06-15-portal-auth-design.md b/docs/superpowers/specs/2026-06-15-portal-auth-design.md
new file mode 100644
index 0000000..7932e47
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-15-portal-auth-design.md
@@ -0,0 +1,237 @@
+# Portal Authentication — Design & Build Plan
+
+**Status:** in development (`feat/portal-auth`) · **Targets:** 0.14.x · **Date:** 2026-06-15
+
+Supersedes the interim shareable magic-link described in
+[CLIENT_PORTAL.md](../../CLIENT_PORTAL.md) with a real password gate.
+
+## Goal
+
+Give a client a **secure link + password** that opens a **read-only dashboard** —
+live data plus access to historical data — for the machines commissioned on
+**their project**. Nothing else: no device control, no editing, no internal pages.
+
+This is the first real, internet-facing, client-credentialed surface in the
+system.
+
+## Scope
+
+**Phase 1 (this spec — build now):** per-project, password-gated, read-only portal.
+
+**Deferred (designed, not built — captured below so nothing is lost):**
+- **Operator auth** — logins + roles for the *internal* app (you / parents).
+  Full design in [Deferred A](#deferred-a--operator-auth-designed-not-built).
+- **Full multi-tenancy** — per-client rollups, per-project separation within a
+  client, individual client user accounts, and extending the portal to all
+  client-relevant data. [Deferred B](#deferred-b--full-multi-tenancy).
+
+## Principles (the portal's standing charter)
+
+1. **Read-only.** A client can look, never touch.
+2. **Strictly scoped, server-side.** Never trust a project / location / unit id
+   from the request — always re-resolve ownership.
+3. **Cache-first.** Portal live data comes from SLMM's cache (the same cached
+   reads the internal dashboard uses). A client can never make us hit the device.
+4. **The gate is a swappable seam.** Everything routes through the scoping layer
+   the portal already has; auth is the thin thing in front of it.
+
+## The model
+
+- **Tenant unit = the project.** Each project is its own portal: one link, one
+  password, showing that project's commissioned machines.
+- **Shared credential — "company / project-manager wide."** No individual client
+  accounts. Because access is read-only, one shared password per project is an
+  acceptable trade. (Per-person accounts are a Deferred-B item.)
+- **The link identifies the project; the password authorizes.** A password alone
+  can't say *which* project — so the link carries an unguessable, revocable
+  per-project token, and the password is the shared secret gating it.
+
+## Architecture
+
+Two layers, two subdomains (hosting target: office Synology NAS behind a UniFi
+UXG Max; own domain `terra-mechanics.com`).
+
+```
+Internet
+   │
+UniFi UXG Max ── Layer 1 (IT pro): firewall, IPS/IDS, GeoIP allow-list,
+   │             kill-switch rule, 443 only
+Synology NAS  ── DSM reverse proxy + Let's Encrypt wildcard TLS
+   │
+   ├─ terra-view.terra-mechanics.com → internal app (operator auth = Deferred A)
+   └─ portal.terra-mechanics.com     → LOCKED to /portal/* only, password gate
+```
+
+The portal subdomain is **restricted to `/portal/*` at the reverse proxy** — a
+client on `portal.` physically cannot reach `/roster`, `/admin/*`, etc., even by
+guessing URLs. This path-lock is a load-bearing control for as long as the
+internal app remains unauthenticated (until Deferred A lands).
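
The path-lock rule described above can be modeled in a few lines. This is only a sketch of the decision, not the DSM reverse-proxy configuration (which this spec does not show); `portal_path_allowed` and `PORTAL_PREFIX` are hypothetical names, and the input is assumed to be the request path without its query string.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical constant: the only prefix the portal subdomain may serve.
PORTAL_PREFIX = "/portal/"


def portal_path_allowed(raw_path: str) -> bool:
    """Return True only if the path stays under /portal after normalization.

    Percent-decoding and normalizing first means a request such as
    /portal/../roster is judged by where it actually lands.
    """
    path = posixpath.normpath(unquote(raw_path))
    return path == "/portal" or path.startswith(PORTAL_PREFIX)
```

Normalizing before matching is the point of the sketch: a plain prefix check on the raw path would accept `/portal/../admin/users`.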
+
+## Data model
+
+Add three columns to **`Project`**:
+
+| Column | Type | Purpose |
+|---|---|---|
+| `portal_enabled` | bool, default `false` | Is the portal open for this project. |
+| `portal_password_hash` | text, nullable | argon2id hash of the shared password. Never plaintext. |
+| `portal_link_token` | text, unique, nullable | Unguessable token in the secure link; identifies the project without exposing its raw id, and is revocable (regenerate → old link dies). |
+
+**Reused unchanged:** the `Client → Project → MonitoringLocation →
+UnitAssignment → unit` scoping chain and the existing read-only scoped data
+routes (`resolve_client_location` + live / history / events).
+
+**Migration:** `migrate_add_project_portal_auth.py` — an `ALTER TABLE` adding the
+three columns to the existing (non-empty) `projects` table. Same pattern as
+`migrate_add_client_portal.py`; `create_all` won't add columns to an existing
+table.
+
+## Auth flow
+
+1. **Operator enables + shares.** On the project page, the operator turns the
+   portal on; the system generates a strong password + a `portal_link_token`; the
+   operator copies **link + password** to send the client.
+2. **Client opens the link** `portal.terra-mechanics.com/portal/p/{link_token}` →
+   the project is resolved from the token → a **password prompt** renders.
+3. **Client submits the password** → argon2-verified against
+   `portal_password_hash`. On success, a **signed session cookie scoped to that
+   project** is set (HMAC via the existing `SECRET_KEY` cookie machinery), and
+   they are redirected to the project dashboard.
+4. **Subsequent requests** re-validate the cookie (signature + project still
+   `portal_enabled` + within cookie max-age) and serve the existing read-only
+   scoped data.
+5. **Logout** clears the cookie. **Revoke** = disable the portal or regenerate the
+   token / password, which kills outstanding links and any session minted from
+   them on the next request.
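
Steps 3 and 4 above can be sketched with the standard library. This is a minimal illustration of an HMAC-signed, project-scoped cookie with a max-age check, not the existing `portal_auth.py` code; the payload layout, `MAX_AGE_SECONDS`, `sign_session`, and `verify_session` are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder, never ship this value
MAX_AGE_SECONDS = 12 * 3600  # assumed lifetime; the spec does not fix a number


def _signature(body: str) -> str:
    return hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()


def sign_session(project_id: int, now: Optional[float] = None) -> str:
    """Build '<base64 payload>.<hex HMAC>' for one project."""
    issued = int(time.time() if now is None else now)
    payload = json.dumps({"pid": project_id, "iat": issued}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    return f"{body}.{_signature(body)}"


def verify_session(cookie: str, now: Optional[float] = None) -> Optional[int]:
    """Return the project id if signature and age check out, else None.

    The caller still has to confirm the project is portal_enabled.
    """
    body, sep, sig = cookie.rpartition(".")
    if not sep:
        return None
    if not hmac.compare_digest(sig.encode(), _signature(body).encode()):
        return None
    data = json.loads(base64.urlsafe_b64decode(body))
    age = (time.time() if now is None else now) - data["iat"]
    if age < 0 or age > MAX_AGE_SECONDS:
        return None
    return data["pid"]
```

The signature is checked before the payload is decoded, so untrusted bytes never reach `json.loads`. The `portal_enabled` check from step 4 is left to the caller because it needs a database read.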
+
+**Lockout:** track failed attempts (per token + IP); after 5 failures refuse for
+a 15-minute cooldown. Combined with the UniFi GeoIP/IPS edge, that's solid for a
+read-only surface.
+
+**Shared cookie machinery:** lift the portal's cookie sign/verify out of
+`portal_auth.py` into a small shared `backend/auth_cookies.py` — one signer, so
+the future operator auth (Deferred A) reuses it instead of copy-pasting crypto.
+
+### Relationship to the existing portal code
+
+The portal today is *client-scoped* (a `ClientAccessToken` magic-link → a cookie
+covering all of a client's projects, with a `/portal` overview). Phase 1 makes the
+entry point *project-scoped*:
+
+- The **`/portal/p/{link_token}` + password** flow becomes the way in; the
+  interim client magic-link (`/portal/enter/{token}`, `/portal/open/*`,
+  `PORTAL_OPEN_LINKS`) is **retired** in its favor.
+- The existing read-only views (`/portal/location/{id}`, live / history / events)
+  and the scoping helper are **reused as-is**, just resolved against the project in
+  the session cookie instead of the client.
+- `Client` / `ClientAccessToken` rows are **left in place** (no destructive
+  migration) — they become the substrate for the Deferred-B per-client rollup.
+
+## Operator "Portal access" panel
+
+On the project detail page (internal app), a panel that:
+- Toggles `portal_enabled`.
+- **Regenerate password** → shows a freshly generated strong password **once** for
+  the operator to copy.
+- **Copy link** → the `/portal/p/{token}` URL.
+- **Revoke** → regenerate the token (old link dies) and/or disable the portal.
+
+This is an operator action. Until operator auth lands (Deferred A), it sits behind
+the same posture as the rest of the internal app — see Security notes.
+
+## Error handling
+
+- **Bad password** → generic "incorrect password" + increment fail count.
+- **Unknown / disabled / revoked token** → generic "this portal link is no longer
+  active" page (no project-existence leak).
+- **Locked out** → "too many attempts, try again in 15 minutes."
+- **Expired / invalid cookie** → back to the password prompt.
+- **Portal disabled after a session started** → next request bounced to the prompt.
+
+## Rollout
+
+1. Implement on `feat/portal-auth` → review → merge to `dev`.
+2. **Migration** `migrate_add_project_portal_auth.py` on each DB (dev + prod), same
+   drill as the client-portal migration.
+3. **`SECRET_KEY`** must be a real value in prod (already required for the existing
+   portal cookie; the password gate reuses it).
+4. **Hosting:** DSM reverse proxy routes `portal.` → app, locked to `/portal/*`;
+   Let's Encrypt wildcard TLS; cookies `Secure` once on TLS. UXG Max GeoIP + IPS +
+   kill-switch handled by the IT pro.
+5. Enable a real project's portal, set a password, and test the full
+   link → password → dashboard flow over HTTPS before sending a client.
+
+## Testing
+
+- **Unit:** argon2 hash/verify; token resolution (valid / unknown / disabled);
+  lockout counter; cookie sign/verify + scope check; "disabled mid-session" bounce.
+- **Scoping:** a session for project A cannot read project B's locations / history
+  / events (404, no existence leak).
+- **Manual smoke:** enable → copy link + password → open in a fresh browser →
+  wrong password (lockout) → right password → see live + history → logout.
+
+---
+
+## Deferred A — Operator auth (designed, not built)
+
+Logins + roles for the **internal** app (`terra-view.` subdomain). Closes the
+"internal app is wide open" hole. Full design, ready to lift into its own spec:
+
+- **Two layers:** UniFi UXG Max edge (IT-pro owned — firewall, IPS, GeoIP,
+  kill-switch, 443-only) + in-app auth (built by us). Internet-exposed with login
+  (no VPN — deliberately, to spare non-technical family members).
+- **`OperatorUser` model:** `id, email (unique, lowercased), display_name,
+  password_hash (argon2id), role, active, created_at, last_login_at,
+  sessions_valid_from, failed_login_count, locked_until` (+ later `totp_secret`,
+  `totp_enabled`).
+- **Role ladder:** `superadmin > admin > operator`.
+  - `superadmin` = you — everything + account management (create/disable users,
+    reset passwords, assign roles).
+  - `admin` = your parents (company owners) + you — full run of the app, no
+    operational restrictions.
+  - `operator` = **future** restricted tier for hires; the ladder accepts it with
+    no route changes.
+  - The only thing gated above plain `admin` in v1 is account management
+    (`superadmin`).
+- **Sessions:** stateless signed cookie reusing `auth_cookies.py` + `SECRET_KEY`
+  (distinct cookie name from the portal). `sessions_valid_from` gives "log out
+  everywhere" / revoke-on-password-change with no session table.
+- **Authorization:** one **deny-by-default middleware** gates the whole internal
+  app (exempt: `/login`, `/logout`, `/health`, `/static/*`, `/portal/*`);
+  `require_role("admin"|"superadmin")` guards specific routes. New routes are
+  protected automatically.
+- **Lockout:** 5 fails → 15-min cooldown (doubling).
+- **2FA:** deferred; TOTP later, admin/superadmin account first.
+- **Safe rollout (no self-lockout):** ship behind a feature flag
+  `OPERATOR_AUTH_ENABLED` (default **off** = app behaves as today) → seed the first
+  `superadmin` via a small CLI (`backend/operator_admin.py`, modeled on
+  `portal_admin.py`) → log in while still open → flip the flag on → create
+  parents' accounts. Flag back off = instant escape hatch; break-glass =
+  re-run seed / `reset-password` CLI in the container.
+- **`OperatorUser` is a brand-new table** → `create_all` builds it on startup; only
+  the seed step is required.
+
+## Deferred B — Full multi-tenancy
+
+- Per-client **rollup**: one login spanning all of a client's projects.
+- Per-project **separation within a client** (true tenant isolation).
+- **Individual client user accounts** (per-person, optional roles) replacing the
+  shared per-project password.
+- Extend the portal to **all client-relevant data types** (beyond sound:
+  vibration, reports, etc.) — the long-term goal of "everything we can show a
+  client."
+- All additive on the existing scoping seam — no teardown.
+
+## Security notes
+
+- Auth-gated from day one (even the shared password) — never wide-open like the
+  internal app currently is.
+- Scoping enforced server-side; client-supplied ids always re-checked.
+- Passwords argon2-hashed; link tokens unguessable + revocable; raw password shown
+  once.
+- `SECRET_KEY` a real secret in prod; cookies `HttpOnly` + `SameSite=Lax` +
+  `Secure` (once on TLS).
+- **Known risk:** the operator "Portal access" panel — and the whole internal app —
+  is unauthenticated until Deferred A. Mitigated for now by the `/portal/*`
+  path-lock on the public subdomain plus keeping the internal app off the public
+  internet. Tracked in the hardening backlog (CLIENT_PORTAL.md).
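
The cookie attributes listed in the security notes can be sketched with Python's standard `http.cookies` module. The cookie name `portal_session`, the `/portal` path, the default max-age, and the `build_session_cookie` helper are assumptions for illustration; the app's framework may set these attributes through its own response API instead.

```python
from http.cookies import SimpleCookie


def build_session_cookie(value: str, on_tls: bool, max_age: int = 43200) -> str:
    """Return a Set-Cookie header value with the flags from the security notes."""
    cookie = SimpleCookie()
    cookie["portal_session"] = value  # hypothetical cookie name
    morsel = cookie["portal_session"]
    morsel["path"] = "/portal"  # assumed: limit the cookie to portal routes
    morsel["httponly"] = True  # not readable from page scripts
    morsel["samesite"] = "Lax"
    morsel["max-age"] = max_age
    if on_tls:
        morsel["secure"] = True  # only add Secure once the site is on TLS
    return morsel.OutputString()
```

Keeping `Secure` behind a flag mirrors the "once on TLS" condition: a `Secure` cookie would not be sent back over plain HTTP during local development.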