Commit Graph

308 Commits

Author SHA1 Message Date
serversdown 01180d5725 fix: retire portal_admin mint-link (dead /portal/enter URL); refresh docstrings; assert revoke route gone 2026-06-16 00:15:09 +00:00
serversdown f0a13ea2ff refactor: retire interim magic-link/open-link in favor of password gate
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-16 00:06:02 +00:00
serversdown 0394f4b0c8 fix: error handling + robust state in Portal access panel JS (per review) 2026-06-16 00:02:33 +00:00
serversdown eb91441904 feat: operator Portal access panel (enable + password + link)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:59:41 +00:00
serversdown 25a4a28433 feat: operator portal-access endpoints (enable/password/disable/state)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:55:10 +00:00
serversdown b8e4718318 fix: link project to its portal client (project.client_id) so the portal isn't empty
Caught by adversarial review of the scope test: portal_client_for_project minted a
dedicated client but never set project.client_id, so the client-scoped routes found
no projects — every location 404'd, including the client's own (empty portal). Now
links the project + adds a positive-case test.
2026-06-15 23:53:19 +00:00
serversdown c3eb900b7e test: portal session is isolated to its own project (404 on others)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:48:08 +00:00
serversdown c74dada8b3 fix: treat enabled-but-passwordless portal as inactive (no dead form / self-lockout) 2026-06-15 23:46:14 +00:00
serversdown d75f405857 feat: per-project portal password gate (/portal/p/{token}) + lockout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:41:37 +00:00
serversdown 446d8704f9 refactor: hoist Project import to top; drop unused test import 2026-06-15 23:39:14 +00:00
serversdown c04830a0ad feat: per-project portal session mint + link-token resolve + lockout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:35:48 +00:00
serversdown b11e1a554f feat: add per-project portal gate columns + migration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:32:41 +00:00
serversdown ad6de946b5 refactor: simplify verify_password except clause; drop unused import 2026-06-15 23:31:14 +00:00
serversdown d44625374d feat: argon2 password hashing helpers for the portal 2026-06-15 23:29:26 +00:00
serversdown 33069a070d test: tidy conftest fixtures per review (drop dead try/finally, scope override cleanup, rm unused import) 2026-06-15 23:28:16 +00:00
serversdown ec5d986ac5 test: stand up pytest harness + add argon2-cffi
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 23:23:41 +00:00
serversdown 0888da32b4 docs: portal-auth Phase 1 implementation plan
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 21:11:33 +00:00
serversdown 485e3f165b docs: portal-auth design spec (Phase 1 password gate; operator-auth + multi-tenant deferred)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 18:27:40 +00:00
serversdown 5f02a0bc21 Merge client portal into dev
Reviewed-on: #61
2026-06-11 23:21:52 -04:00
serversdown 684a487203 docs: changelog [Unreleased] — add the client portal feature
Documents the read-only client portal under [Unreleased] alongside the SLM
live-monitoring work: per-client scoping + interim auth, live location view with
the auto-closing WS stream, locations overview map + rollup, the alerts
config→surface→24/7 track, operator sharing tools, the field-instrument design +
light/dark toggle, the security posture, and upgrade notes (migration, SECRET_KEY,
SLMM alert-engine pairing).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 03:20:16 +00:00
serversdown 04cd6b9f24 docs(portal): security hardening backlog for the dedicated pass
Consolidates the deferred items (reverse proxy exposing only /portal/*, TLS,
SECRET_KEY, PORTAL_OPEN_LINKS off, M4 auth incl. the operator app + currently-
unauthenticated operator endpoints, and the smaller code-review items) into an
actionable checklist so the hardening session starts from a list, not a recall.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 02:39:33 +00:00
serversdown fe7cf91488 fix(portal): pre-merge security hardening from code review
- PORTAL_OPEN_LINKS now defaults OFF — /portal/open/* is an unauthenticated,
  proxy-reachable session-minting path (and a linked project's open link grants
  the whole client's scope), so it must be explicitly enabled in dev.
- Session cookie: enforce server-side expiry (check iat vs COOKIE_MAX_AGE — was
  browser-only) and guard a non-dict signed body (was an uncaught AttributeError →
  500, reachable if SECRET_KEY is the insecure default).
- Escape operator-set strings (location/rule/event names) before innerHTML +
  Leaflet tooltips — they're client-facing, so a name with markup was stored XSS
  in the client's browser. Global esc() helper applied at every injection point.
- WS _scrub_frame drops a non-JSON frame instead of forwarding it raw; /history
  rows now whitelisted like the other scoped endpoints.
- Preview-client slug uses the full project id (an 8-char prefix could collide
  two projects onto one client).

Verified: cookie reader (fresh/expired/non-dict/missing-iat) + open-links default
off; templates parse; scoped scrubbing intact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 23:40:52 +00:00
serversdown c1bc391ba2 feat(portal): show the client their active alert limits
New scoped GET /portal/api/location/{id}/thresholds returns the enabled alert
rules (scrubbed: name/metric/comparison/threshold/duration/schedule — no cooldown
or hysteresis internals). Location page renders an "Alert limits" panel above the
history, e.g. "Night noise · Leq above 65 dB for 60s · 22:00–07:00", hidden when
no limits are set. Gives the breach history context.

Verified: portal.py compiles; location script balances; template parses.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 23:28:57 +00:00
serversdown a81764d4bc style(portal): cool light background, keep the solid cards
Reverts the light-mode ground to a cool light (#eef2f9) with cool navy ink,
borders, and shadow — keeping the solid (opaque, defined) cards from the
un-ghosting pass so it's clean rather than dull. theme-color meta updated to match.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:46:04 +00:00
serversdown a555cb74dd style(portal): default to light theme
Light is now the default for new visitors/clients (was dark); the toggle still
flips to dark and the choice persists. Also fixed the mobile theme-color meta to
update the actual <meta> tag (was setting a no-op attribute on <html>) and use the
warm paper shade.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:42:55 +00:00
serversdown 505c2e3ca5 style(portal): warm, solid light mode — paper bg + defined cards
Light mode was washed out. Switch the background to warm paper (#f7f5ef), make
panels solid white (no longer translucent/ghostly) with a warm hairline border
and a grounded two-layer shadow, and warm the text ink. Light-specific hover
shadow (the dark one is invisible on paper). Also fix two dark-only reds — the
alarm banner and active-event text now use var(--lvl-bad) so they read on both
themes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:34:31 +00:00
serversdown f760e81309 style(portal): live-console location page, polished access splash, light/dark toggle
- Location page rebuilt as a monitoring console: Leq hero readout (mono, level-
  colored, auto-flips with theme), instrument strip for Lp/Lmax/L1/L10, refined
  dark Chart.js (mono ticks, thin lines), panel-styled alert history, polished
  pause overlay. All live-stream/chart/alert JS hooks preserved.
- Access page → centered branded splash.
- Light/Dark toggle: CSS-variable theme system (structure + level/metric accent
  colors flip), header sun/moon button, localStorage + no-flash boot script,
  smooth body transition. On toggle, a 'portal-theme' event re-skins the Chart.js
  trace and swaps Leaflet tiles (CARTO dark <-> light) + recolors map dots.

All JS hook IDs intact (verified); both themes validated to parse + balance.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 20:10:29 +00:00
serversdown 4839d14a22 style(portal): field-instrument redesign — shell + overview
A refined dark "field instrument" aesthetic for the client-facing portal:
- Type: Hanken Grotesk UI + IBM Plex Mono for readings (dB values feel like real
  instrumentation). Tabular numerals.
- Atmosphere: deep navy-black base with a navy/burgundy aurora and a faint fixed
  instrument grid; sticky blurred header with an animated signal-bars mark.
- Panel system (.panel/.panel-hover): translucent, hairline-lit, depth + hover
  lift. Pulsing live dot; staggered load reveal.
- Overview: mono Leq hero on each tile (colored by level when live), pill badges
  with the pulsing dot, rollup pills, dark CARTO map tiles, level-colored dots.

All live-data JS hook IDs preserved (verified). No backend change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 19:41:48 +00:00
serversdown fa7dc39e5e feat(portal): M2b-3 note — enabled alerts keep the device monitored 24/7
UI note on the SLM alerts card reflecting the SLMM keepalive coupling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 19:36:16 +00:00
serversdown 0914cf0a75 feat(portal): M2b-2 — surface alert state + breach history (internal + portal)
Internal (SLM detail page): live alarm-state badge in the Alerts header
(● N active / ✓ all clear), a History list of fired events (onset → clear, peak
dB, ack status) with an Ack button, refreshed every 20s. Reads the existing SLMM
/alerts/events + /ack via the proxy.

Portal (client, read-only, scoped): new GET /portal/api/location/{id}/events —
ownership-gated, returns a scrubbed projection (rule_name/metric/threshold/onset/
peak/clear/status only; no internal ids or ack-by) plus an `active` count. The
location page shows a red "Currently above threshold" banner when active and a
read-only breach history, polled every 20s. No ack on the client side.

Verified: portal.py compiles; both scripts balance; both templates parse.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 19:02:56 +00:00
serversdown 29b974a1f7 feat(portal): M2b-1 — alert rule config UI on the SLM detail page
Adds an "Alerts" card to /slm/{id}: lists rules and a create/edit/delete form
(simple-first — "Alert when [Leq] is [above] [65] dB for [N] s", optional
time-of-day window + day picker, advanced hysteresis/cooldown collapsed). Talks
to the existing SLMM alert CRUD via the proxy (/api/slmm/{unit}/alerts/rules);
no SLMM changes. Rule changes invalidate the evaluator's cache server-side.

Verified: alerts script JS balances, slm_detail.html parses, and the TV proxy
forwards method + JSON body + query params for POST/PUT/DELETE.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 18:01:27 +00:00
serversdown bececafe78 feat(portal): plain no-token "open" links for dev feedback (PORTAL_OPEN_LINKS)
Adds a frictionless shareable link so anyone can open a project's client portal
during dev without minting/copying a magic token. GET /portal/open/{project_id}
(gated by PORTAL_OPEN_LINKS) provisions the client session and lands on /portal;
lives under /portal so it works through a proxy exposing only /portal/*.

The project page's "Copy client link" modal now leads with this Quick share link
(amber, host taken from window.location.origin so it always matches the host you
copied it from — no more LAN-vs-public foot-gun). The token-based generate/list/
revoke stays below for the eventual secure path.

PORTAL_OPEN_LINKS defaults ON for the prototype (whole app is open anyway) and logs
a warning; set =false before real clients. The get_current_client seam is
untouched, so M4 auth still layers in front of the same routes regardless.

Verified: compiles, share script balances, detail.html parses, flag default
on / =false off.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 17:26:37 +00:00
serversdown 2da9493cb5 feat(portal): "Copy client link" — generate/copy/revoke shareable links from the project page
No-CLI way to get a real shareable magic link (/portal/enter/<token>) for a
project's client. Project page gets a "Copy client link" button next to the
preview; opens a modal that lists active links (with revoke), generates a fresh
one, and copies it to the clipboard.

Backend (operator, internal /projects/*):
- POST /projects/{id}/portal-link  -> mint a fresh token, return the full URL
  (built from request.base_url so it uses the operator's host).
- GET  /projects/{id}/portal-links -> list active links (label/created/last-used).
- POST /projects/{id}/portal-link/{tid}/revoke -> revoke one (scoped to the
  project's client).

Refactor: split ensure_project_client() + mint_link_token() out of
provision_preview_session() so minting a shareable link and the preview cookie
share one provisioning path.

Verified: ensure/mint persistence across commits + sessions, minted link resolves,
token stored hashed, second mint = distinct active link (4/4); compiles; share
script balances; detail.html parses.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 17:11:34 +00:00
serversdown b908f394ed feat(portal): M2a — live status map + status rollup on the overview
Reuses the existing per-location /live fetch (no backend change):
- Map dots recolor live by current level (green/amber/red bands, grey when
  not measuring/offline) and the tooltip shows the live Leq. Bands are
  placeholders until M2 alert thresholds drive the color.
- Status rollup header: total locations, # live vs offline, and a "Loudest now"
  Leq callout. Aggregated each 15s refresh.

Refactored the refresh into refreshAll() (Promise.all over loadTile -> updateRollup);
loadTile now also feeds liveState + recolors the matching map dot.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 06:05:09 +00:00
serversdown 5455d3a931 style(portal): overview map uses dot markers, matching the internal project map
Swap Leaflet's default teardrop pins for L.circleMarker (radius 8, seismo-orange
fill, white border) + a name tooltip, same as partials/projects/location_map.html.
Also disables scroll-wheel zoom to match.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 03:40:17 +00:00
serversdown b971d19068 feat(portal): tile headline is Leq, not Lp; note live-map for M2
Lp (instantaneous) twitches every reading and makes a poor at-a-glance headline;
Leq (energy-average) is the stable, standard sound-monitoring/compliance metric.
Overview tiles now lead with Leq. Design doc: live project map (status-colored
pins + current-reading popups) recorded as an M2 item; headline-metric rationale
noted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 03:33:42 +00:00
serversdown 0103917870 feat(portal): live ~1Hz WS stream with auto-close (visibility + idle cap)
The portal location view is now genuinely live, not a 15s poll. Scoped WS endpoint
/portal/api/location/{id}/stream: authenticates via the session cookie, enforces
ownership (resolve_client_location), then bridges the unit's shared SLMM /monitor
fan-out feed to the browser — a viewer is just one more subscriber, no extra
device connection. Frames are scrubbed to the portal whitelist (drops unit_id,
raw_payload, counter, lmin) before reaching the client.

location.html: cache prefill for instant first paint, then upgrades to the live
socket (cards tick ~1Hz, chart scrolls). Auto-close so an abandoned tab can't pin
the device at 1Hz polling (~8x cellular data):
- closes when the tab is hidden, reopens when visible (Page Visibility) — the main
  guard;
- hard 15-min cap -> "Live paused — click to resume" overlay.

Refactor: client_from_cookie() extracted from get_current_client so the WS handler
(no Request-based Depends) can auth the same way.

Verified: scrub drops internal fields / keeps metrics + heartbeat (7/7), auth
refactor (3/3), portal compiles, location.html JS balances + parses.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 03:16:32 +00:00
serversdown 3fc20e104a feat(portal): one-click "View client portal" preview from the project page
Adds a "View client portal" button on the project detail page that opens the
client portal scoped to that project — no CLI. GET /projects/{id}/portal-preview
auto-provisions a client + access token for the project (provision_preview_session)
and seals a portal session cookie, then redirects to /portal.

- Reuses the project's linked client if it has one; otherwise creates/reuses a
  per-project 'preview-<id>' client. Only sets project.client_id when unset, so it
  never clobbers a real client link. Idempotent — repeat clicks reuse the same
  client/token.
- Lives under /projects (not /portal), so a future public proxy exposing only
  /portal/* won't expose this operator shortcut.

Verified: provisioning (unlinked creates+links, idempotent, linked-no-clobber) 7/7.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 02:18:06 +00:00
serversdown 2031681d0f docs(portal): add "Going to prod" checklist (migration, SECRET_KEY, exposure)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 02:01:58 +00:00
serversdown 1cf80ea7ea fix(portal): portal_admin.py runnable as a script, not just -m
`python3 backend/portal_admin.py` set sys.path[0] to backend/, hiding the
`backend` package and breaking `from backend.database import ...`. Insert the
project root on sys.path so the documented script invocation works.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 01:16:30 +00:00
serversdown 26b4b1e7e4 feat(portal): M1 admin CLI — create client, link projects, mint/revoke links
backend/portal_admin.py (run in-container): create-client, link-project (by id/
number/name -> sets Project.client_id), mint-link (prints the full magic URL once,
stores only the hash), list, revoke. PORTAL_BASE_URL controls the printed link base.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:43:28 +00:00
serversdown d3e221b6b1 feat(portal): M1 pages — locations overview + read-only live location view
/portal overview: client's active sound locations as live tiles (current Lp +
Live/Stopped badge + "updated Xm ago", polled from the scoped cache every 15s)
plus a Leaflet map of locations with coordinates. /portal/location/{id}: 404-gated
read-only live panel — Lp/Leq/Lmax/L1/L10 cards + a 4-line Chart.js trace
(backfilled from /history) + measuring/freshness badge. Cache-only, 15s poll, no
device controls, no refresh-from-device. _client_locations() feeds the overview.

Verified: portal.py compiles; both inline scripts balance; all four portal
templates parse in Jinja2.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:41:13 +00:00
serversdown 9f40210057 feat(portal): M1 scoping gate + scoped cache endpoints
resolve_client_location() enforces ownership (sound location in one of the
client's active projects) and 404s everything else — same response for missing
and not-yours, so location existence never leaks. active_unit_for_location()
resolves the currently-assigned SLM.

Scoped GET /portal/api/location/{id}/live and /history: gate -> resolve unit ->
read SLMM cache (never the device). /live returns a SCRUBBED projection (sound
metrics + run state only; no battery/SD/raw_payload). Both degrade gracefully
when there's no device or SLMM is down.

Verified: ownership gate (owns / other-client / vibration / deleted-project /
removed / nonexistent) + active-vs-completed unit resolution — 8/8 on a temp DB.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:38:16 +00:00
serversdown 6c048a9c30 feat(portal): M1 auth gate — signed magic-URL session + get_current_client
backend/portal_auth.py: stdlib HMAC-signed session cookie carrying the access-
token id (re-validated against the DB each request, so revoke kills live
sessions), hash_token, resolve_token, and the get_current_client dependency
(raises PortalAuthError). SECRET_KEY env (insecure dev default + warning).

routers/portal.py: /portal/enter/{token} mints the cookie -> /portal; /logout;
/access; /portal home stub. main.py registers the router + a PortalAuthError
handler (HTML access page for pages, 401 JSON for /portal/api/*).

Portal shell templates (base, access_required, overview stub), branded dark.

Verified: cookie round-trip + tamper/garbage rejection, token resolution
(valid/bad), get_current_client (valid/no-cookie/revoked) — 8/8 against a temp DB.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:36:09 +00:00
serversdown 80a8470b55 feat(portal): M1 data model — Client, ClientAccessToken, Project.client_id
Client (customer org), ClientAccessToken (interim hashed magic-URL gate), and an
authoritative Project.client_id FK (client_name kept for display). New tables
auto-create via create_all; migrate_add_client_portal.py adds projects.client_id.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:32:09 +00:00
serversdown a64c9ced65 docs: client portal design + milestone plan (M1 live view → M4 full auth)
Read-only, client-scoped portal inside Terra-View (/portal/*), reusing cached
SLMM reads. Data chain Client -> Project.client_id -> MonitoringLocation ->
active UnitAssignment -> unit_id -> SLMM cache. Auth is a swappable
get_current_client gate; M1-M3 ride an interim signed "magic URL", M4 replaces
the backing. Milestones: M1 live view, M2 dashboard+alerts, M3 reports, M4 auth.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 21:29:14 +00:00
serversdown 182e224f3c Merge pull request 'Feat: add SLM live monitoring improvements' (#60) from feat/slm-live-monitor into dev
Reviewed-on: #60

## [Unreleased]

SLM live monitoring — fan-out feed + cache-first reads.  Targets **0.14.0**.  The throughline: the NL-43 allows exactly **one** TCP connection at a time, so every page that opened its own device stream (or sent its own `Measure?`/DOD on load) was competing for that single connection — a second viewer saw nothing, and dashboard loads stole polling resolution from the live feed.  This release moves Terra-View entirely onto SLMM's shared, cached monitoring: one DOD poll loop per device, fanned out to all viewers; dashboards read SLMM's cache (a DB read on SLMM's side) instead of touching the device; and the live panels populate instantly from cache on open, upgrading to the live WS only on demand.  Paired with the SLMM-side work (adaptive poll rate, unreachable backoff, device-offline alert) on SLMM branch `dev`.

### Added

- **Fan-out `/monitor` feed consumption.**  The unit live view (`partials/slm_live_view.html`) and the dashboard live tile (`sound_level_meters.html`) now subscribe to SLMM's shared per-device monitor over `WS /api/slmm/{unit}/monitor` instead of each opening its own device stream.  Any number of clients attach without each consuming the NL-43's single connection — the "second viewer sees nothing" contention is gone.  A WS proxy handler for `/monitor` was added to `backend/routers/slmm.py`.
- **L1/L10 percentile lines + cards.**  Both the per-unit live chart and the dashboard card chart now plot L1 (purple) and L10 (orange) alongside Lp/Leq, and the KPI cards show L1/L10.  Sourced from the DOD feed's `ln1`/`ln2` (DRD streaming can't carry percentiles, DOD can).  Missing/`-.-` values leave a gap rather than dropping the line to 0.
- **Live-chart backfill on open.**  Charts seed from SLMM's downsampled DOD trail (`GET /api/slmm/{unit}/history?hours=2`) so a viewer sees recent trend immediately instead of a blank chart that fills one point per second.
- **Live Measurements panel auto-populates from cache.**  Opening the dashboard panel fills the KPI cards from cached `/status` and backfills the chart from `/history` — pure cache reads, no device hit.  Shows a measuring badge (● Measuring / ■ Stopped) and a freshness stamp ("as of 3:48 PM (10s ago)", amber + "cached" when stale).  Re-polls the cache every 15s while open; **Start Live Stream** upgrades to the live WS and no longer wipes the backfilled trail (chart point cap raised 60 → 600).
- **Refresh buttons** — one per device-list row, one in the panel header.  On-demand, user-initiated single device read via `GET /api/slmm/{unit}/live` (which also refreshes SLMM's cache), with a spinner + success/error toast, then reloads the device list.
- **Per-unit live-monitoring (keepalive) toggle on `/admin/slmm`** — turns a device's server-side keepalive feed on/off (`POST /monitor/start|stop`), so alerting can keep a device's feed running with no browser attached.

### Changed

- **Dashboard device list + command center read SLMM's cache, not the device.**  `slm_dashboard.py`'s `get_slm_units` pulls each unit's cached status from SLMM's `/roster` (one call, a SLMM DB read) for the badge + freshness; the command-center `get_live_view` reads cached `/status` instead of sending `Measure?` + a fresh DOD on every load.  This stops dashboard loads from stealing the device's single connection from the live monitor.  The elapsed-measurement timer still works because `measurement_start_time` is now included in the cached `/status` response.
- **Device-list freshness reflects real monitoring.**  The "Last check" line now uses SLMM's cached `last_seen` (which the monitor advances on every successful poll) via `unit.cache_last_seen`, instead of the `slm_last_check` roster field the monitor never updates.  The status badge also treats `Measure` as Measuring, matching the panel and SLMM's cache.
- **Status badge relocated** to the card's bottom meta row (next to "Last check"), off the top-right corner where it collided with the chart/gear/refresh action icons.

### Fixed

- **Deploy/bench threw `can't access property "dispatchEvent", e is null`.**  `toggleSLMDeployed()` and the save-config path called `htmx.trigger('#slm-list', 'load')` guarded only by `typeof htmx !== 'undefined'`; no page has a `#slm-list`, so htmx resolved null and called `null.dispatchEvent(...)`.  The deploy POST had already succeeded, so the operator saw both the green success **and** a red error.  Both call sites now guard on the element existing (`slm_settings_modal.html`).
- **Monitor WS proxy leaked `CancelledError` / "task exception never retrieved"** on stream stop — the cleanup awaited pending tasks but only caught `Exception`, missing `CancelledError` (a `BaseException`).
- **"No recent check-in" shown even on an actively-monitored device** — the row read the stale `slm_last_check` roster field instead of SLMM's live cache (see Changed).
- **L1/L10 KPI cards populated but the chart drew no L1/L10 lines** — the card chart only had Lp + Leq datasets.

### Upgrade Notes

Requires the **matching SLMM build (branch `dev`)** — Terra-View now depends on SLMM's fan-out `/monitor` feed, `/history` trail, `/status` carrying `ln1`/`ln2` + `measurement_start_time`, cached `/roster` status, and the `monitor_enabled` keepalive flag.

```bash
# SLMM (branch dev) — REBUILD + MIGRATE (or you'll get `no such column: nl43_status.ln1` 500s)
cd /home/serversdown/slmm && docker compose build slmm && docker compose up -d slmm
docker exec terra-view-slmm-1 python3 migrate_add_ln_percentiles.py
docker exec terra-view-slmm-1 python3 migrate_add_monitor_enabled.py

# Terra-View — NO migration; templates are baked into the image, so rebuild (don't just restart)
cd /home/serversdown/terra-view && docker compose build terra-view && docker compose up -d terra-view
```

The two builds must ship **together**.  Note the `docker-compose.yml` container was renamed for clarity (now `terra-view-terra-view-1`) — adjust any `docker exec` scripts that referenced the old name.

---
2026-06-10 16:33:25 -04:00
serversdown 2e832708f3 docs: changelog [Unreleased] — SLM live monitoring (fan-out feed + cache-first reads)
Captures the whole feat/slm-live-monitor effort (12 commits): fan-out /monitor
feed consumption, L1/L10 chart lines, live-chart backfill, cache-populated live
panel with measuring/freshness, per-unit + panel refresh, admin keepalive toggle,
dashboard/command-center cache reads, and the dispatchEvent/CancelledError/
freshness fixes. Targets 0.14.0; Upgrade Notes flag the paired SLMM `dev` build +
its migrations.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 20:17:10 +00:00
serversdown 5e3645e229 fix(slm): show real cache freshness in device list + draw L1/L10 on card chart
1. "No recent check-in" was always shown because the row's last-check text read
   unit.slm_last_check (a Terra-View roster field the monitor never updates),
   while the live freshness lives in SLMM's cached NL43Status.last_seen. Carry
   that last_seen onto the unit (unit.cache_last_seen) and display it (falling
   back to slm_last_check). Also treat "Measure" as Measuring in the badge, to
   match the panel and the cache's MEASURING_STATES.

2. The dashboard card chart only had Lp + Leq datasets, so L1/L10 never drew even
   though the cards showed them. Add L1 (purple) and L10 (orange) datasets and
   feed ln1/ln2 in both the /history backfill and the live /monitor frames.
   Percentiles parse via numOrNull so a missing "-.-" leaves a gap (spanGaps)
   instead of dropping the line to 0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 19:52:57 +00:00
serversdown 88f258d1c7 chore: rename terra-view container for clarity. 2026-06-10 19:45:55 +00:00