# Client Portal: Design & Build Plan

**Status:** in development (`feat/client-portal`) · **Targets:** 0.14.x

A client-facing, **read-only**, **scoped** view into a client's own monitoring
data. It is the first internet-facing surface in the system used by real clients.
Built *inside* the Terra-View app (new `/portal/*` namespace), reusing the cached
SLMM reads and Terra-View's report generation: Terra-View stays the UI/business
layer; SLMM stays the device layer.

## Principles

1. **Read-only.** No device control (start/stop/config), no roster editing, no
   internal pages. A client can look, never touch.
2. **Strictly scoped.** A client only ever sees data for *their* projects. Every
   portal endpoint verifies ownership server-side; never trust a `unit_id` /
   `location_id` from the request.
3. **Cache-first, no device contention.** Portal live data comes from SLMM's
   cache (the same cached `/status` + `/history` the internal dashboard uses).
   No device-hitting calls from the portal, so a client can't make us hammer the
   NL-43. Freshness depends on **keepalive being on** for the client's units.
4. **Auth is a swappable gate.** Every route depends on one resolver,
   `get_current_client()`. M1–M3 ride on an interim signed "magic URL"; M4
   replaces the resolver's backing without touching routes or templates.

## The data chain (how a client maps to live data)

```
Client.id
 └─ Project (client_id == Client.id, status != deleted)
     └─ MonitoringLocation (project_id, location_type == "sound", removed_at IS NULL)
         └─ UnitAssignment (location_id, status == "active", device_type == "slm",
                            assigned_until IS NULL or future)
             └─ unit_id == RosterUnit.id == SLMM unit_id
                 └─ SLMM cached /status + /history (read-only)
```

So the portal shows a client their **locations**, each surfacing the live sound
level from whatever SLM is currently assigned there.

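The `UnitAssignment` step of the chain is the one with the most conditions, so a minimal sketch of it may help. It assumes string ids and an in-memory list of assignments; the dataclass and the `active_slm_unit` helper are hypothetical stand-ins, not the app's ORM code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class UnitAssignment:
    # Field names follow the chain above; this dataclass is illustrative only.
    location_id: str
    unit_id: str
    status: str
    device_type: str
    assigned_until: Optional[datetime] = None


def active_slm_unit(
    assignments: Iterable[UnitAssignment], location_id: str, now: datetime
) -> Optional[str]:
    """Return the unit_id of the SLM currently assigned to a location, or None."""
    for a in assignments:
        if a.location_id != location_id:
            continue
        if a.status != "active" or a.device_type != "slm":
            continue
        # assigned_until IS NULL (open-ended) or still in the future
        if a.assigned_until is None or a.assigned_until > now:
            return a.unit_id
    return None


now = datetime(2026, 1, 15, tzinfo=timezone.utc)
assignments = [
    UnitAssignment("loc-1", "slm-old", "active", "slm", now - timedelta(days=1)),
    UnitAssignment("loc-1", "slm-new", "active", "slm", None),
    UnitAssignment("loc-2", "seis-1", "active", "seismograph", None),
]
current_unit = active_slm_unit(assignments, "loc-1", now)
```

Here the expired assignment is skipped and the open-ended one wins; a location with no active SLM yields `None`, which the portal can render as an empty tile.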
## Data model (new)

```python
class Client(Base):                 # the customer org
    id, name, slug (unique, URL-safe), contact_email (nullable, for M4),
    active (bool), created_at

class ClientAccessToken(Base):      # the interim "magic URL" gate
    id, client_id, token_hash (sha256; raw shown once on creation),
    label, created_at, last_used_at, revoked_at (nullable)
```

Plus a migration adding **`Project.client_id`** (nullable FK → `clients.id`).
The existing free-text `Project.client_name` stays for display/back-compat;
`client_id` is the authoritative link.

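The "hashed, raw shown once" behaviour of `token_hash` could look roughly like this. It is a sketch under assumptions: the helper names (`hash_token`, `mint_token`, `find_active_token`) are hypothetical, rows are plain dicts rather than ORM objects, and the 32-byte token length is an arbitrary choice.

```python
import hashlib
import secrets
from typing import Optional, Tuple


def hash_token(raw: str) -> str:
    """SHA-256 hex digest of the raw token; only this value is stored."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def mint_token() -> Tuple[str, str]:
    """Return (raw, token_hash). The raw value is displayed once, never stored."""
    raw = secrets.token_urlsafe(32)
    return raw, hash_token(raw)


def find_active_token(rows: list, raw: str) -> Optional[dict]:
    """Look up an un-revoked token row by the hash of the presented raw token."""
    wanted = hash_token(raw)
    for row in rows:
        if row["revoked_at"] is None and secrets.compare_digest(
            row["token_hash"], wanted
        ):
            return row
    return None
```

Because only the digest is persisted, a database leak does not hand out working links, and setting `revoked_at` on the row is enough to make the lookup fail.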
## Auth: the swappable gate

```python
def get_current_client(request, db) -> Client:  # every /portal route depends on this
    # M1–M3: read signed `portal_client` cookie -> load Client
    # M4: same signature, backed by real sessions (magic-link / password)
    ...
```

**Interim "magic URL" flow (M1–M3):**
- Operator creates a `Client` + an access token → gets a one-time-display URL:
  `https://…/portal/enter/{token}`.
- Client clicks it → token is hashed, looked up (must be un-revoked) →
  sets a **signed session cookie** (`portal_client`, HMAC via a new `SECRET_KEY`
  env) → redirects to `/portal`. `last_used_at` updated.
- `get_current_client` reads + verifies the cookie thereafter. No valid cookie →
  "link invalid / expired" page.
- Revoke = set `revoked_at`; the link (and any cookie minted from it) stops working.

Unguessable + revocable + per-person, with no email infra or passwords yet; M4
slots in behind the same `get_current_client` with zero route/template churn.

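One way the signed `portal_client` cookie could be built and verified with the standard library is sketched below. The payload format (`<token_id>.<hex signature>`) is an assumption, chosen so that the cookie points back at the access-token row and revocation can still be checked on every request; `sign_cookie` and `verify_cookie` are hypothetical names.

```python
import hashlib
import hmac
from typing import Optional


def _signature(payload: str, secret_key: str) -> str:
    return hmac.new(secret_key.encode(), payload.encode(), hashlib.sha256).hexdigest()


def sign_cookie(token_id: int, secret_key: str) -> str:
    """Build the cookie value '<token_id>.<hmac-sha256 hex>'."""
    payload = str(token_id)
    return f"{payload}.{_signature(payload, secret_key)}"


def verify_cookie(cookie: str, secret_key: str) -> Optional[int]:
    """Return the token id if the signature checks out, else None."""
    payload, sep, sig = cookie.rpartition(".")
    if not sep or not (payload.isascii() and payload.isdigit()):
        return None
    expected = _signature(payload, secret_key)
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    return int(payload)
```

A resolver built on this would still load the token row and reject it when `revoked_at` is set; the signature only proves the cookie was minted by the server.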
## Routes (`/portal/*`)

| Route | Purpose |
|-------|---------|
| `GET /portal/enter/{token}` | validate token → set cookie → redirect to `/portal` |
| `GET /portal` | client's locations overview (status tiles + map) |
| `GET /portal/location/{id}` | read-only live panel for that location's SLM |
| `GET /portal/api/location/{id}/live` | **scoped** cached `/status` for the location's unit |
| `GET /portal/api/location/{id}/history` | **scoped** cached trail for the chart |
| `GET /portal/logout` | clear cookie |

**Scoping helper** (used by every data route):
`resolve_client_location(client, location_id, db) -> (location, unit_id)` raises
403 if the location isn't in one of the client's projects. The portal never calls
the open `/api/slmm/{unit}/*` endpoints with a client-supplied id.

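The ownership check behind the scoping helper can be sketched as follows. Everything here is illustrative: `FakeDb` replaces the real session with three lookup dicts, `PortalForbidden` is a hypothetical exception standing in for whatever the web framework uses for a 403, and the chain's other filters (deleted projects, removed locations) are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


class PortalForbidden(Exception):
    """Hypothetical error that the web layer would map to HTTP 403."""
    status_code = 403


@dataclass
class FakeDb:
    project_client: Dict[int, int] = field(default_factory=dict)    # project_id -> client_id
    location_project: Dict[int, int] = field(default_factory=dict)  # location_id -> project_id
    location_unit: Dict[int, str] = field(default_factory=dict)     # location_id -> active unit_id


def resolve_client_location(
    client_id: int, location_id: int, db: FakeDb
) -> Tuple[int, Optional[str]]:
    """Return (location_id, unit_id), or raise if the client doesn't own it."""
    project_id = db.location_project.get(location_id)
    owner = db.project_client.get(project_id)
    # Unknown locations get the same 403 as foreign ones, so ids can't be probed.
    if project_id is None or owner != client_id:
        raise PortalForbidden("location is not in one of this client's projects")
    return location_id, db.location_unit.get(location_id)
```

The unit id comes out of the server-side lookup, never from the request, which is what keeps the portal from calling SLMM with a client-supplied id.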
## Templates (`templates/portal/`)

- `portal/base.html`: minimal client-branded shell (no internal sidebar/nav).
- `portal/overview.html`: location tiles (mini live cards) + a locations map.
- `portal/location.html`: the read-only live panel, with cards (Lp/Leq/Lmax/L1/L10),
  an L1/L10 chart, and a measuring + freshness badge. Reuses the cache-populate JS
  from the internal panel, **stripped** of start/stop, config, and the
  device-hitting refresh (cache + 15s auto-poll only).

---

## Milestones

### M1: Live view only *(current)*
Interim magic-URL gate; a client sees their locations and per-location read-only
live data, all from cache.
- [ ] `Client` + `ClientAccessToken` models; `Project.client_id` migration.
- [ ] `SECRET_KEY` env + signed-cookie session helper.
- [ ] `get_current_client` dependency + `/portal/enter/{token}` + logout.
- [ ] Scoping helper `resolve_client_location`.
- [ ] `/portal` overview + `/portal/location/{id}` (read-only live panel).
- [ ] Scoped `/portal/api/location/{id}/live` + `/history`.
- [ ] Portal templates (base, overview, location).
- [ ] Minimal admin: create client + mint/revoke access link (small `/admin`
  page or a script for now).

### M2: Dashboard + alerts
- Richer client dashboard (multi-location at-a-glance, status rollup).
- **Live project map:** upgrade the overview's basic location pins into a real
  project map: pins colored by measuring/level, popups showing each location's
  current reading, centered/zoomed to the project. (M1 ships the plain pin map;
  this makes it a live status map.)
- Surface each location's **threshold-alert status** (read-only) + an event/inbox
  view. Leans on the SLMM alert engine + dispatch.

### Notes carried from M1
- Tile headline metric is **Leq** (energy-average, the sound-monitoring compliance
  metric), chosen over the twitchy instantaneous Lp. If clients ever want a
  different headline (e.g. Lmax for peaks), make it a per-deployment setting.

### M3: Reports
- Client-facing list + download of the daily baseline-comparison reports.
- Depends on the FTP report pipeline (`feat/ftp-report-pipeline`) landing and
  being wired into the portal's scoped routes.

### M4: Full auth system
- Replace the interim token behind `get_current_client` with a real auth design:
  magic-link (passwordless email) and/or accounts, proper sessions, password
  reset, and likely auth for the *internal* app too. Reverse-proxy + TLS posture.

## Going to prod (M1)

1. **Run the migration on the prod DB.** `migrate_add_client_portal.py` adds
   `projects.client_id` (the new tables auto-create via `create_all`). Skipping it
   makes every request that touches `Project.client_id` fail with a 500. This is
   the silent killer.
   ```bash
   docker compose exec web-app python3 backend/migrate_add_client_portal.py
   ```
2. **Set a real `SECRET_KEY`** in the prod env (compose). The portal signs session
   cookies with it; with the insecure dev default (it logs a warning at boot),
   cookies are forgeable. Non-negotiable for an internet-facing portal.
3. **`SLMM_BASE_URL`.** The prod base compose already points at `:8100` (correct;
   the `:9100` mismatch is a dev-only override quirk). For full live data (L1/L10 +
   chart backfill) prod SLMM must be on the `dev` build with its migrations
   (`migrate_add_ln_percentiles`, `migrate_add_monitor_enabled`) and **keepalive on**
   for the client's units. Otherwise the portal degrades gracefully (cards show
   `--`, chart empty); it just isn't fully populated.
4. **Seed real clients** with the CLI (`backend/portal_admin.py`): `create-client`
   → `link-project` (a real sound project with an active SLM assignment) →
   `mint-link` → send the client the printed URL (shown once).
5. **Exposure.** Portal routes are auth-gated, but port 8001 still serves the
   whole *internal* app with no auth. Before real clients are on it, the portal
   should sit behind the reverse proxy with only `/portal/*` exposed (or the app
   restricted). This is the point where the parked reverse-proxy/TLS work becomes
   load-bearing.

## Security notes

- Portal is auth-gated from day one (even the interim gate), never wide-open like
  the internal app.
- All scoping enforced server-side; client-supplied ids are always re-checked.
- `SECRET_KEY` must be a real secret in prod (env, not committed).
- Cookies: `HttpOnly`, `SameSite=Lax`, `Secure` once behind TLS.
- Tokens stored hashed; raw shown once. Revocation is immediate.

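The cookie attributes listed above map directly onto the standard library's cookie builder. This is only a sketch: `build_portal_cookie` and its `behind_tls` flag are hypothetical, and a real app would more likely set these through its web framework's response object.

```python
from http.cookies import SimpleCookie


def build_portal_cookie(value: str, behind_tls: bool) -> str:
    """Return the Set-Cookie header value for the portal session cookie."""
    cookie = SimpleCookie()
    cookie["portal_client"] = value
    morsel = cookie["portal_client"]
    morsel["httponly"] = True    # not readable from page JavaScript
    morsel["samesite"] = "Lax"   # not sent on cross-site subrequests
    if behind_tls:
        morsel["secure"] = True  # only ever sent over HTTPS
    return morsel.OutputString()
```

Keeping `Secure` behind a flag mirrors the note above: it should only be switched on once TLS is in place, otherwise browsers will refuse to send the cookie over plain HTTP.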
## Security hardening backlog ("Fest 2026")

The to-do for the dedicated hardening pass, roughly highest-impact first. Until
then the portal runs on security-by-obscurity (open port + interim links), which
is fine for a not-in-use demo but not for real clients.

**Exposure (the big one):** port 8001 serves the *entire operator app* (roster,
projects, `/admin/*`, device config, the SLMM proxy) with **zero auth**, so an
open port exposes far more than the read-only portal.
- [ ] Reverse proxy (NPM/Caddy/Nginx) in front, exposing **only `/portal/*`** to
  the internet; keep the operator app reachable on the LAN only.
- [ ] TLS everywhere (Let's Encrypt). Then set portal cookies `Secure`.
- [ ] Don't port-forward the raw app; if a quick gate is wanted before M4, an
  auth proxy (Authelia / Authentik) can front the portal without writing auth.

**Config musts:**
- [ ] Set a real `SECRET_KEY` env (signs session cookies; default is public).
- [ ] `PORTAL_OPEN_LINKS=false` in any internet-facing env (it defaults off now).

**M4: real auth** (replaces the interim token behind `get_current_client`):
- [ ] Magic-link email and/or accounts; proper sessions + password reset.
- [ ] Authenticate the **operator** app too (it currently has none).
- [ ] Gate the operator-only endpoints that are presently unauthenticated:
  `/projects/{id}/portal-preview`, `/projects/{id}/portal-link*`,
  `/portal/open/*`.

**Smaller items from the pre-merge code review:**
- [ ] Keepalive isn't auto-turned-off when the last alert rule on a unit is
  deleted (intentional "never auto-off"; revisit if it wastes cellular).
- [ ] Consider rate-limiting the scoped portal endpoints once public.
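
For the rate-limiting item, an in-process sliding-window limiter is one possible starting point. This is a sketch, not a decision: the class name, the per-client key, and the limits in the usage note are all assumptions, and a limit enforced at the reverse proxy may turn out to be the better fit.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Keyed by client id and called with `time.monotonic()`, a limit comfortably above the panel's 15s auto-poll would leave normal use untouched while capping scripted scraping of the scoped endpoints.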