docs: portal-auth design spec (Phase 1 password gate; operator-auth + multi-tenant deferred)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,237 @@
# Portal Authentication — Design & Build Plan

**Status:** in development (`feat/portal-auth`) · **Targets:** 0.14.x · **Date:** 2026-06-15

Supersedes the interim shareable magic-link described in
[CLIENT_PORTAL.md](../../CLIENT_PORTAL.md) with a real password gate.

## Goal

Give a client a **secure link + password** that opens a **read-only dashboard** —
live data plus access to historical data — for the machines commissioned on
**their project**. Nothing else: no device control, no editing, no internal pages.

This is the first real, internet-facing, client-credentialed surface in the
system.

## Scope

**Phase 1 (this spec — build now):** per-project, password-gated, read-only portal.

**Deferred (designed, not built — captured below so nothing is lost):**
- **Operator auth** — logins + roles for the *internal* app (you / parents).
  Full design in [Deferred A](#deferred-a--operator-auth-designed-not-built).
- **Full multi-tenancy** — per-client rollups, per-project separation within a
  client, individual client user accounts, and extending the portal to all
  client-relevant data. [Deferred B](#deferred-b--full-multi-tenancy).

## Principles (the portal's standing charter)

1. **Read-only.** A client can look, never touch.
2. **Strictly scoped, server-side.** Never trust a project / location / unit id
   from the request — always re-resolve ownership.
3. **Cache-first.** Portal live data comes from SLMM's cache (the same cached
   reads the internal dashboard uses). A client can never make us hit the device.
4. **The gate is a swappable seam.** Everything routes through the scoping layer
   the portal already has; auth is the thin thing in front of it.
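
Principle 2 can be illustrated with a minimal sketch. It assumes an in-memory stand-in for the `Project → MonitoringLocation` part of the chain; `Location`, `LOCATIONS`, `resolve_location_for_project`, and `PortalNotFound` are hypothetical names for illustration only and are not the existing `resolve_client_location` helper.

```python
from dataclasses import dataclass


class PortalNotFound(Exception):
    """Raised for both 'does not exist' and 'not yours', so callers return one 404."""


@dataclass(frozen=True)
class Location:
    id: int
    project_id: int
    name: str


# Hypothetical stand-in for the MonitoringLocation table.
LOCATIONS = {
    1: Location(id=1, project_id=10, name="North fence"),
    2: Location(id=2, project_id=20, name="Loading dock"),
}


def resolve_location_for_project(session_project_id: int, location_id: int) -> Location:
    """Re-resolve ownership server-side; the request-supplied id is never trusted."""
    location = LOCATIONS.get(location_id)
    if location is None or location.project_id != session_project_id:
        # Same error for "missing" and "belongs to another project": no existence leak.
        raise PortalNotFound(location_id)
    return location
```

The project id comes from the verified session, never from the URL, so a guessed location id for another project is indistinguishable from one that does not exist.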

## The model

- **Tenant unit = the project.** Each project is its own portal: one link, one
  password, showing that project's commissioned machines.
- **Shared credential — "company / project-manager wide."** No individual client
  accounts. Because access is read-only, one shared password per project is an
  acceptable trade. (Per-person accounts are a Deferred-B item.)
- **The link identifies the project; the password authorizes.** A password alone
  can't say *which* project — so the link carries an unguessable, revocable
  per-project token, and the password is the shared secret gating it.

## Architecture

Two layers, two subdomains (hosting target: office Synology NAS behind a UniFi
UXG Max; own domain `terra-mechanics.com`).

```
Internet
   │
UniFi UXG Max ── Layer 1 (IT pro): firewall, IPS/IDS, GeoIP allow-list,
   │             kill-switch rule, 443 only
Synology NAS  ── DSM reverse proxy + Let's Encrypt wildcard TLS
   │
   ├─ terra-view.terra-mechanics.com → internal app (operator auth = Deferred A)
   └─ portal.terra-mechanics.com     → LOCKED to /portal/* only, password gate
```

The portal subdomain is **restricted to `/portal/*` at the reverse proxy** — a
client on `portal.` physically cannot reach `/roster`, `/admin/*`, etc., even by
guessing URLs. This path-lock is a load-bearing control for as long as the
internal app remains unauthenticated (until Deferred A lands).

## Data model

Add three columns to **`Project`**:

| Column | Type | Purpose |
|---|---|---|
| `portal_enabled` | bool, default `false` | Is the portal open for this project. |
| `portal_password_hash` | text, nullable | argon2id hash of the shared password. Never plaintext. |
| `portal_link_token` | text, unique, nullable | Unguessable token in the secure link; identifies the project without exposing its raw id, and is revocable (regenerate → old link dies). |

**Reused unchanged:** the `Client → Project → MonitoringLocation →
UnitAssignment → unit` scoping chain and the existing read-only scoped data
routes (`resolve_client_location` + live / history / events).

**Migration:** `migrate_add_project_portal_auth.py` — an `ALTER TABLE` adding the
three columns to the existing (non-empty) `projects` table. Same pattern as
`migrate_add_client_portal.py`; `create_all` won't add columns to an existing
table.
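
As a rough illustration of that migration, here is a minimal sketch. It assumes the database is SQLite and uses the standard-library `sqlite3` module; the spec does not name the engine, and the actual `migrate_add_project_portal_auth.py` may look different. The `migrate` function and the index name `ix_projects_portal_link_token` are hypothetical.

```python
import sqlite3

# Column DDL mirrors the table above (SQLite types are an assumption).
NEW_COLUMNS = {
    "portal_enabled": "BOOLEAN NOT NULL DEFAULT 0",
    "portal_password_hash": "TEXT",
    "portal_link_token": "TEXT",
}


def migrate(conn: sqlite3.Connection) -> list:
    """Add the portal-auth columns to `projects`; safe to re-run. Returns the columns added."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(projects)")}
    added = []
    for name, ddl in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE projects ADD COLUMN {name} {ddl}")
            added.append(name)
    # SQLite cannot add a UNIQUE column via ALTER TABLE, so uniqueness is
    # enforced with a separate index (NULLs are still allowed to repeat).
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS ix_projects_portal_link_token "
        "ON projects (portal_link_token)"
    )
    conn.commit()
    return added
```

Checking `PRAGMA table_info` first makes the script idempotent, which helps when the same migration is run against both dev and prod.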

## Auth flow

1. **Operator enables + shares.** On the project page, the operator turns the
   portal on; the system generates a strong password + a `portal_link_token`; the
   operator copies **link + password** to send the client.
2. **Client opens the link** `portal.terra-mechanics.com/portal/p/{link_token}` →
   the project is resolved from the token → a **password prompt** renders.
3. **Client submits the password** → argon2-verified against
   `portal_password_hash`. On success, a **signed session cookie scoped to that
   project** is set (HMAC via the existing `SECRET_KEY` cookie machinery), and
   they are redirected to the project dashboard.
4. **Subsequent requests** re-validate the cookie (signature + project still
   `portal_enabled` + within cookie max-age) and serve the existing read-only
   scoped data.
5. **Logout** clears the cookie. **Revoke** = disable the portal or regenerate the
   token / password, which kills outstanding links and any session minted from
   them on the next request.
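
Step 1 (generating the link token and the password) might look like the sketch below, which uses only the standard-library `secrets` module. The function names, the 32-byte token size, and the 20-character password length are assumptions, not values taken from the spec. Hashing is left out because argon2id needs a third-party package (for example `argon2-cffi`), which the spec does not name.

```python
import secrets
import string

# Letters and digits only, so the password survives copy/paste into email or chat.
PASSWORD_ALPHABET = string.ascii_letters + string.digits


def new_link_token() -> str:
    """URL-safe token built from 32 random bytes (43 characters)."""
    return secrets.token_urlsafe(32)


def new_password(length: int = 20) -> str:
    """Random password drawn from PASSWORD_ALPHABET with a CSPRNG."""
    return "".join(secrets.choice(PASSWORD_ALPHABET) for _ in range(length))


def portal_link(token: str) -> str:
    """Build the shareable URL in the shape described in step 2."""
    return f"https://portal.terra-mechanics.com/portal/p/{token}"
```

Only the hash of the password would be stored; the plaintext is shown to the operator once and then discarded.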

**Lockout:** track failed attempts (per token + IP); after 5 failures refuse for
a 15-minute cooldown. Combined with the UniFi GeoIP/IPS edge, that's solid for a
read-only surface.
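
The lockout rule can be sketched as a small in-memory counter. This is only an illustration under stated assumptions: `LockoutTracker` and its methods are hypothetical names, state lives in process memory (a real deployment might persist it), and the counter resets after a lockout, which the spec does not specify.

```python
import time
from dataclasses import dataclass, field

MAX_FAILURES = 5
COOLDOWN_SECONDS = 15 * 60


@dataclass
class LockoutTracker:
    """Failed-attempt counter keyed by (link token, client IP)."""

    _failures: dict = field(default_factory=dict)      # key -> failure count
    _locked_until: dict = field(default_factory=dict)  # key -> unix timestamp

    def is_locked(self, token, ip, now=None):
        now = time.time() if now is None else now
        return self._locked_until.get((token, ip), 0.0) > now

    def record_failure(self, token, ip, now=None):
        now = time.time() if now is None else now
        key = (token, ip)
        count = self._failures.get(key, 0) + 1
        if count >= MAX_FAILURES:
            self._locked_until[key] = now + COOLDOWN_SECONDS
            count = 0  # assumption: start a fresh count once the cooldown ends
        self._failures[key] = count

    def record_success(self, token, ip):
        self._failures.pop((token, ip), None)
        self._locked_until.pop((token, ip), None)
```

A login handler would call `is_locked` before verifying the password, then `record_failure` or `record_success` depending on the result.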

**Shared cookie machinery:** lift the portal's cookie sign/verify out of
`portal_auth.py` into a small shared `backend/auth_cookies.py` — one signer, so
the future operator auth (Deferred A) reuses it instead of copy-pasting crypto.

### Relationship to the existing portal code

The portal today is *client-scoped* (a `ClientAccessToken` magic-link → a cookie
covering all of a client's projects, with a `/portal` overview). Phase 1 makes the
entry point *project-scoped*:

- The **`/portal/p/{link_token}` + password** flow becomes the way in; the
  interim client magic-link (`/portal/enter/{token}`, `/portal/open/*`,
  `PORTAL_OPEN_LINKS`) is **retired** in its favor.
- The existing read-only views (`/portal/location/{id}`, live / history / events)
  and the scoping helper are **reused as-is**, just resolved against the project in
  the session cookie instead of the client.
- `Client` / `ClientAccessToken` rows are **left in place** (no destructive
  migration) — they become the substrate for the Deferred-B per-client rollup.

## Operator "Portal access" panel

On the project detail page (internal app), a panel that:
- Toggles `portal_enabled`.
- **Regenerate password** → shows a freshly generated strong password **once** for
  the operator to copy.
- **Copy link** → the `/portal/p/{token}` URL.
- **Revoke** → regenerate the token (old link dies) and/or disable the portal.

This is an operator action. Until operator auth lands (Deferred A), it sits behind
the same posture as the rest of the internal app — see Security notes.

## Error handling

- **Bad password** → generic "incorrect password" + increment fail count.
- **Unknown / disabled / revoked token** → generic "this portal link is no longer
  active" page (no project-existence leak).
- **Locked out** → "too many attempts, try again in 15 minutes."
- **Expired / invalid cookie** → back to the password prompt.
- **Portal disabled after a session started** → next request bounced to the prompt.

## Rollout

1. Implement on `feat/portal-auth` → review → merge to `dev`.
2. **Migration** `migrate_add_project_portal_auth.py` on each DB (dev + prod), same
   drill as the client-portal migration.
3. **`SECRET_KEY`** must be a real value in prod (already required for the existing
   portal cookie; the password gate reuses it).
4. **Hosting:** DSM reverse proxy routes `portal.` → app, locked to `/portal/*`;
   Let's Encrypt wildcard TLS; cookies `Secure` once on TLS. UXG Max GeoIP + IPS +
   kill-switch handled by the IT pro.
5. Enable a real project's portal, set a password, and test the full
   link → password → dashboard flow over HTTPS before sending a client.

## Testing

- **Unit:** argon2 hash/verify; token resolution (valid / unknown / disabled);
  lockout counter; cookie sign/verify + scope check; "disabled mid-session" bounce.
- **Scoping:** a session for project A cannot read project B's locations / history
  / events (404, no existence leak).
- **Manual smoke:** enable → copy link + password → open in a fresh browser →
  wrong password (lockout) → right password → see live + history → logout.

---

## Deferred A — Operator auth (designed, not built)

Logins + roles for the **internal** app (`terra-view.` subdomain). Closes the
"internal app is wide open" hole. Full design, ready to lift into its own spec:

- **Two layers:** UniFi UXG Max edge (IT-pro owned — firewall, IPS, GeoIP,
  kill-switch, 443-only) + in-app auth (built by us). Internet-exposed with login
  (no VPN — deliberately, to spare non-technical family members).
- **`OperatorUser` model:** `id, email (unique, lowercased), display_name,
  password_hash (argon2id), role, active, created_at, last_login_at,
  sessions_valid_from, failed_login_count, locked_until` (+ later `totp_secret`,
  `totp_enabled`).
- **Role ladder:** `superadmin > admin > operator`.
  - `superadmin` = you — everything + account management (create/disable users,
    reset passwords, assign roles).
  - `admin` = your parents (company owners) + you — full run of the app, no
    operational restrictions.
  - `operator` = **future** restricted tier for hires; the ladder accepts it with
    no route changes.
  - The only thing gated above plain `admin` in v1 is account management
    (`superadmin`).
- **Sessions:** stateless signed cookie reusing `auth_cookies.py` + `SECRET_KEY`
  (distinct cookie name from the portal). `sessions_valid_from` gives "log out
  everywhere" / revoke-on-password-change with no session table.
- **Authorization:** one **deny-by-default middleware** gates the whole internal
  app (exempt: `/login`, `/logout`, `/health`, `/static/*`, `/portal/*`);
  `require_role("admin"|"superadmin")` guards specific routes. New routes are
  protected automatically.
- **Lockout:** 5 fails → 15-min cooldown (doubling).
- **2FA:** deferred; TOTP later, admin/superadmin account first.
- **Safe rollout (no self-lockout):** ship behind a feature flag
  `OPERATOR_AUTH_ENABLED` (default **off** = app behaves as today) → seed the first
  `superadmin` via a small CLI (`backend/operator_admin.py`, modeled on
  `portal_admin.py`) → log in while still open → flip the flag on → create
  parents' accounts. Flag back off = instant escape hatch; break-glass =
  re-run seed / `reset-password` CLI in the container.
- **`OperatorUser` is a brand-new table** → `create_all` builds it on startup; only
  the seed step is required.
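
Three of the rules above (the role ladder, `sessions_valid_from`, and the doubling cooldown) reduce to small comparisons, sketched here under stated assumptions. `Role`, `has_role`, `session_is_current`, and `cooldown_seconds` are hypothetical names, and reading "(doubling)" as "the cooldown doubles with each successive lockout" is an interpretation, not something the design states outright.

```python
from enum import IntEnum


class Role(IntEnum):
    """Ordered so that a higher value means more privilege."""

    OPERATOR = 1
    ADMIN = 2
    SUPERADMIN = 3


def has_role(user_role: str, required: str) -> bool:
    """True when user_role sits at or above required on the ladder."""
    return Role[user_role.upper()] >= Role[required.upper()]


def session_is_current(cookie_issued_at: float, sessions_valid_from: float) -> bool:
    """A cookie minted before sessions_valid_from is treated as logged out."""
    return cookie_issued_at >= sessions_valid_from


def cooldown_seconds(lockouts_so_far: int, base: int = 15 * 60) -> int:
    """15 minutes for the first lockout, doubling for each later one (assumed reading)."""
    return base * 2 ** lockouts_so_far
```

Because the ladder is an ordered enum, adding the future `operator` tier changes no route guards: a route requiring `admin` already rejects it. Bumping `sessions_valid_from` to the current time on a password change would invalidate every older cookie without a session table.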

## Deferred B — Full multi-tenancy

- Per-client **rollup**: one login spanning all of a client's projects.
- Per-project **separation within a client** (true tenant isolation).
- **Individual client user accounts** (per-person, optional roles) replacing the
  shared per-project password.
- Extend the portal to **all client-relevant data types** (beyond sound:
  vibration, reports, etc.) — the long-term goal of "everything we can show a
  client."
- All additive on the existing scoping seam — no teardown.

## Security notes

- Auth-gated from day one (even the shared password) — never wide-open like the
  internal app currently is.
- Scoping enforced server-side; client-supplied ids always re-checked.
- Passwords argon2-hashed; link tokens unguessable + revocable; raw password shown
  once.
- `SECRET_KEY` a real secret in prod; cookies `HttpOnly` + `SameSite=Lax` +
  `Secure` (once on TLS).
- **Known risk:** the operator "Portal access" panel — and the whole internal app —
  is unauthenticated until Deferred A. Mitigated for now by the `/portal/*`
  path-lock on the public subdomain plus keeping the internal app off the public
  internet. Tracked in the hardening backlog (CLIENT_PORTAL.md).