Files
terra-view/docs/CLIENT_PORTAL.md
T
serversdown 04cd6b9f24 docs(portal): security hardening backlog for the dedicated pass
Consolidates the deferred items (reverse proxy exposing only /portal/*, TLS,
SECRET_KEY, PORTAL_OPEN_LINKS off, M4 auth incl. the operator app + currently-
unauthenticated operator endpoints, and the smaller code-review items) into an
actionable checklist so the hardening session starts from a list, not a recall.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 02:39:33 +00:00

209 lines
10 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Client Portal — Design & Build Plan
**Status:** in development (`feat/client-portal`) · **Targets:** 0.14.x
A client-facing, **read-only**, **scoped** view into a client's own monitoring
data. The first internet-facing-with-real-clients surface in the system. Built
*inside* the Terra-View app (new `/portal/*` namespace), reusing the cached SLMM
reads and Terra-View's report generation — Terra-View stays the UI/business layer;
SLMM stays the device layer.
## Principles
1. **Read-only.** No device control (start/stop/config), no roster editing, no
internal pages. A client can look, never touch.
2. **Strictly scoped.** A client only ever sees data for *their* projects. Every
portal endpoint verifies ownership server-side — never trust a `unit_id` /
`location_id` from the request.
3. **Cache-first, no device contention.** Portal live data comes from SLMM's
cache (the same cached `/status` + `/history` the internal dashboard uses).
No device-hitting calls from the portal — a client can't make us hammer the
NL-43. Freshness depends on **keepalive being on** for the client's units.
4. **Auth is a swappable gate.** Every route depends on one resolver,
`get_current_client()`. M1M3 ride on an interim signed "magic URL"; M4
replaces the resolver's backing without touching routes or templates.
## The data chain (how a client maps to live data)
```
Client.id
└─ Project (client_id == Client.id, status != deleted)
└─ MonitoringLocation (project_id, location_type == "sound", removed_at IS NULL)
└─ UnitAssignment (location_id, status == "active", device_type == "slm",
assigned_until IS NULL or future)
└─ unit_id == RosterUnit.id == SLMM unit_id
└─ SLMM cached /status + /history (read-only)
```
So the portal shows a client their **locations**, each surfacing the live sound
level from whatever SLM is currently assigned there.
## Data model (new)
```python
class Client(Base): # the customer org
id, name, slug (unique, URL-safe), contact_email (nullable, for M4),
active (bool), created_at
class ClientAccessToken(Base): # the interim "magic URL" gate
id, client_id, token_hash (sha256 raw shown once on creation),
label, created_at, last_used_at, revoked_at (nullable)
```
Plus a migration adding **`Project.client_id`** (nullable FK → `clients.id`).
The existing free-text `Project.client_name` stays for display/back-compat;
`client_id` is the authoritative link.
## Auth — the swappable gate
```python
def get_current_client(request, db) -> Client: # every /portal route depends on this
# M1M3: read signed `portal_client` cookie -> load Client
# M4: same signature, backed by real sessions (magic-link / password)
```
**Interim "magic URL" flow (M1M3):**
- Operator creates a `Client` + an access token → gets a one-time-display URL:
`https://…/portal/enter/{token}`.
- Client clicks it → token is hashed, looked up (must be un-revoked) →
sets a **signed session cookie** (`portal_client`, HMAC via a new `SECRET_KEY`
env) → redirects to `/portal`. `last_used_at` updated.
- `get_current_client` reads + verifies the cookie thereafter. No valid cookie →
"link invalid / expired" page.
- Revoke = set `revoked_at`; the link (and any cookie minted from it) stops working.
Unguessable + revocable + per-person, no email infra or passwords yet — and M4
slots in behind the same `get_current_client` with zero route/template churn.
## Routes (`/portal/*`)
| Route | Purpose |
|-------|---------|
| `GET /portal/enter/{token}` | validate token → set cookie → redirect to `/portal` |
| `GET /portal` | client's locations overview (status tiles + map) |
| `GET /portal/location/{id}` | read-only live panel for that location's SLM |
| `GET /portal/api/location/{id}/live` | **scoped** cached `/status` for the location's unit |
| `GET /portal/api/location/{id}/history` | **scoped** cached trail for the chart |
| `GET /portal/logout` | clear cookie |
**Scoping helper** (used by every data route):
`resolve_client_location(client, location_id, db) -> (location, unit_id)` — raises
403 if the location isn't in one of the client's projects. The portal never calls
the open `/api/slmm/{unit}/*` endpoints with a client-supplied id.
## Templates (`templates/portal/`)
- `portal/base.html` — minimal client-branded shell (no internal sidebar/nav).
- `portal/overview.html` — location tiles (live cards mini) + a locations map.
- `portal/location.html` — the read-only live panel: cards (Lp/Leq/Lmax/L1/L10),
L1/L10 chart, measuring + freshness badge. Reuses the cache-populate JS from the
internal panel, **stripped** of start/stop, config, and the device-hitting
refresh (cache + 15s auto-poll only).
---
## Milestones
### M1 — Live view only *(current)*
Interim magic-URL gate; a client sees their locations and per-location read-only
live data, all from cache.
- [ ] `Client` + `ClientAccessToken` models; `Project.client_id` migration.
- [ ] `SECRET_KEY` env + signed-cookie session helper.
- [ ] `get_current_client` dependency + `/portal/enter/{token}` + logout.
- [ ] Scoping helper `resolve_client_location`.
- [ ] `/portal` overview + `/portal/location/{id}` (read-only live panel).
- [ ] Scoped `/portal/api/location/{id}/live` + `/history`.
- [ ] Portal templates (base, overview, location).
- [ ] Minimal admin: create client + mint/revoke access link (small `/admin`
page or a script for now).
### M2 — Dashboard + alerts
- Richer client dashboard (multi-location at-a-glance, status rollup).
- **Live project map** — upgrade the overview's basic location pins into a real
project map: pins colored by measuring/level, popups showing each location's
current reading, centered/zoomed to the project. (M1 ships the plain pin map;
this makes it a live status map.)
- Surface each location's **threshold-alert status** (read-only) + an event/inbox
view. Leans on the SLMM alert engine + dispatch.
### Notes carried from M1
- Tile headline metric is **Leq** (energy-average, the sound-monitoring compliance
metric) — chosen over the twitchy instantaneous Lp. If clients ever want a
different headline (e.g. Lmax for peaks), make it a per-deployment setting.
### M3 — Reports
- Client-facing list + download of the daily baseline-comparison reports.
- Depends on the FTP report pipeline (`feat/ftp-report-pipeline`) landing and
being wired into the portal's scoped routes.
### M4 — Full auth system
- Replace the interim token behind `get_current_client` with a real auth design:
magic-link (passwordless email) and/or accounts, proper sessions, password
reset, and likely auth for the *internal* app too. Reverse-proxy + TLS posture.
## Going to prod (M1)
1. **Run the migration on the prod DB**`migrate_add_client_portal.py` adds
`projects.client_id` (the new tables auto-create via `create_all`). Skipping it
500s anything that touches `Project.client_id`. This is the silent killer.
```bash
docker compose exec web-app python3 backend/migrate_add_client_portal.py
```
2. **Set a real `SECRET_KEY`** in the prod env (compose). The portal signs session
cookies with it; the insecure dev default (it logs a warning at boot) is
forgeable. Non-negotiable for an internet-facing portal.
3. **SLMM_BASE_URL** — prod base compose already points at `:8100` (correct; the
`:9100` mismatch is a dev-only override quirk). For full live data (L1/L10 +
chart backfill) prod SLMM must be on the `dev` build with its migrations
(`migrate_add_ln_percentiles`, `migrate_add_monitor_enabled`) and **keepalive on**
for the client's units — otherwise the portal degrades gracefully (cards show
`--`, chart empty), it just isn't fully populated.
4. **Seed real clients** with the CLI (`backend/portal_admin.py`): `create-client`
→ `link-project` (a real sound project with an active SLM assignment) →
`mint-link` → send the client the printed URL (shown once).
5. **Exposure** — portal routes are auth-gated, but port 8001 still serves the
whole *internal* app with no auth. Before real clients are on it, the portal
should sit behind the reverse proxy with only `/portal/*` exposed (or the app
restricted). This is the point where the parked reverse-proxy/TLS work becomes
load-bearing.
## Security notes
- Portal is auth-gated from day one (even the interim gate) — never wide-open like
the internal app.
- All scoping enforced server-side; client-supplied ids are always re-checked.
- `SECRET_KEY` must be a real secret in prod (env, not committed).
- Cookies: `HttpOnly`, `SameSite=Lax`, `Secure` once behind TLS.
- Tokens stored hashed; raw shown once. Revocation is immediate.
## Security hardening backlog ("Fest 2026")
The to-do for the dedicated hardening pass, roughly highest-impact first. Until
then the portal runs on security-by-obscurity (open port + interim links) — fine
for a not-in-use demo, not for real clients.
**Exposure (the big one):** port 8001 serves the *entire operator app* (roster,
projects, `/admin/*`, device config, the SLMM proxy) with **zero auth**, so an
open port exposes far more than the read-only portal.
- [ ] Reverse proxy (NPM/Caddy/Nginx) in front, exposing **only `/portal/*`** to
the internet; keep the operator app reachable on the LAN only.
- [ ] TLS everywhere (Let's Encrypt). Then set portal cookies `Secure`.
- [ ] Don't port-forward the raw app; if a quick gate is wanted before M4, an
auth proxy (Authelia / Authentik) can front the portal without writing auth.
**Config musts:**
- [ ] Set a real `SECRET_KEY` env (signs session cookies; default is public).
- [ ] `PORTAL_OPEN_LINKS=false` in any internet-facing env (it defaults off now).
**M4 — real auth** (replaces the interim token behind `get_current_client`):
- [ ] Magic-link email and/or accounts; proper sessions + password reset.
- [ ] Authenticate the **operator** app too (it currently has none).
- [ ] Gate the operator-only endpoints that are presently unauthenticated:
`/projects/{id}/portal-preview`, `/projects/{id}/portal-link*`,
`/portal/open/*`.
**Smaller items from the pre-merge code review:**
- [ ] Keepalive isn't auto-turned-off when the last alert rule on a unit is
deleted (intentional "never auto-off"; revisit if it wastes cellular).
- [ ] Consider rate-limiting the scoped portal endpoints once public.