Released 2026-06-17 16:43:32 -04:00

SLM live monitoring: fan-out feed + cache-first reads. Targets 0.14.0.

The throughline: the NL-43 allows exactly one TCP connection at a time, so every page that opened its own device stream (or sent its own `Measure?`/DOD on load) was competing for that single connection. A second viewer saw nothing, and dashboard loads stole polling resolution from the live feed. This release moves Terra-View entirely onto SLMM's shared, cached monitoring: one DOD poll loop per device, fanned out to all viewers; dashboards read SLMM's cache (a DB read on SLMM's side) instead of touching the device; and the live panels populate instantly from cache on open, upgrading to the live WS only on demand. Paired with the SLMM-side work (adaptive poll rate, unreachable backoff, device-offline alert) on SLMM branch `dev`.

Added
- Fan-out `/monitor` feed consumption. The unit live view (`partials/slm_live_view.html`) and the dashboard live tile (`sound_level_meters.html`) now subscribe to SLMM's shared per-device monitor over `WS /api/slmm/{unit}/monitor` instead of each opening its own device stream. Any number of clients attach without each consuming the NL-43's single connection, so the "second viewer sees nothing" contention is gone. A WS proxy handler for `/monitor` was added to `backend/routers/slmm.py`.
- L1/L10 percentile lines + cards. Both the per-unit live chart and the dashboard card chart now plot L1 (purple) and L10 (orange) alongside Lp/Leq, and the KPI cards show L1/L10. Sourced from the DOD feed's `ln1`/`ln2` (DRD streaming can't carry percentiles, DOD can). Missing/`-.-` values leave a gap rather than dropping the line to 0.
- Live-chart backfill on open. Charts seed from SLMM's downsampled DOD trail (`GET /api/slmm/{unit}/history?hours=2`) so a viewer sees recent trend immediately instead of a blank chart that fills one point per second.
- Live Measurements panel auto-populates from cache. Opening the dashboard panel fills the KPI cards from cached `/status` and backfills the chart from `/history`: pure cache reads, no device hit. Shows a measuring badge (● Measuring / ■ Stopped) and a freshness stamp ("as of 3:48 PM (10s ago)", amber + "cached" when stale). Re-polls the cache every 15s while open; Start Live Stream upgrades to the live WS and no longer wipes the backfilled trail (chart point cap raised 60 → 600).
- Refresh buttons: one per device-list row, one in the panel header. On-demand, user-initiated single device read via `GET /api/slmm/{unit}/live` (which also refreshes SLMM's cache), with a spinner + success/error toast, then reloads the device list.
- Per-unit live-monitoring (keepalive) toggle on `/admin/slmm`: turns a device's server-side keepalive feed on/off (`POST /monitor/start|stop`), so alerting can keep a device's feed running with no browser attached.
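The fan-out idea behind the `/monitor` feed (one poll loop per device, any number of viewers) can be sketched roughly as follows. This is an illustration under assumptions, not SLMM's implementation: `MonitorHub`, `subscribe`, `publish`, the queue size, and the drop-oldest policy are all hypothetical choices made for the example.

```python
import asyncio


class MonitorHub:
    """One producer per device, fanned out to any number of subscribers.

    Hypothetical sketch: the class and method names are illustrative only.
    """

    def __init__(self, max_queue: int = 16) -> None:
        self._subscribers: set = set()
        self._max_queue = max_queue

    def subscribe(self) -> asyncio.Queue:
        # Each viewer gets its own bounded queue; the device is never touched.
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_queue)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, frame: dict) -> None:
        # A slow viewer loses its oldest frame instead of stalling the poll loop.
        for queue in self._subscribers:
            if queue.full():
                queue.get_nowait()
            queue.put_nowait(frame)


async def demo() -> list:
    hub = MonitorHub()
    viewer_a, viewer_b = hub.subscribe(), hub.subscribe()
    # Stand-in for one DOD poll result read over the device's single connection.
    hub.publish({"lp": 61.2, "leq": 58.9})
    return [await viewer_a.get(), await viewer_b.get()]


frames = asyncio.run(demo())
```

Both viewers receive the same frame from a single publish, which is the property that removes the contention for the NL-43's one connection.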
Changed
- Dashboard device list + command center read SLMM's cache, not the device. `slm_dashboard.py`'s `get_slm_units` pulls each unit's cached status from SLMM's `/roster` (one call, a SLMM DB read) for the badge + freshness; the command-center `get_live_view` reads cached `/status` instead of sending `Measure?` + a fresh DOD on every load. This stops dashboard loads from stealing the device's single connection from the live monitor. The elapsed-measurement timer still works because `measurement_start_time` is now included in the cached `/status` response.
- Device-list freshness reflects real monitoring. The "Last check" line now uses SLMM's cached `last_seen` (which the monitor advances on every successful poll) via `unit.cache_last_seen`, instead of the `slm_last_check` roster field the monitor never updates. The status badge also treats `Measure` as Measuring, matching the panel and SLMM's cache.
- Status badge relocated to the card's bottom meta row (next to "Last check"), off the top-right corner where it collided with the chart/gear/refresh action icons.
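A freshness line derived from a cached `last_seen` timestamp might be computed along these lines. This is a sketch only: `freshness_label` is a hypothetical helper, the 30-second stale threshold is an assumption (these notes do not state the real one), and the actual rendering happens in the templates.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the real stale threshold is not stated in these notes.
STALE_AFTER = timedelta(seconds=30)


def freshness_label(last_seen: datetime, now: datetime) -> str:
    """Format a cached timestamp as 'as of H:MM PM (Ns ago)', flagging stale data."""
    age = now - last_seen
    seconds = max(0, int(age.total_seconds()))
    ago = f"{seconds}s ago" if seconds < 60 else f"{seconds // 60}m ago"
    # %I is zero-padded, so strip the leading zero for a "3:48 PM" style clock.
    clock = last_seen.strftime("%I:%M %p").lstrip("0")
    label = f"as of {clock} ({ago})"
    return f"{label}, cached" if age > STALE_AFTER else label


now = datetime(2026, 6, 17, 15, 48, 10, tzinfo=timezone.utc)
fresh = freshness_label(now - timedelta(seconds=10), now)
stale = freshness_label(now - timedelta(minutes=5), now)
```

Because the label is built from the cached timestamp alone, rendering it never requires a device read.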
Fixed
- Deploy/bench threw `can't access property "dispatchEvent", e is null`. `toggleSLMDeployed()` and the save-config path called `htmx.trigger('#slm-list', 'load')` guarded only by `typeof htmx !== 'undefined'`; no page has a `#slm-list`, so htmx resolved null and called `null.dispatchEvent(...)`. The deploy POST had already succeeded, so the operator saw both the green success and a red error. Both call sites now guard on the element existing (`slm_settings_modal.html`).
- Monitor WS proxy leaked `CancelledError` / "task exception never retrieved" on stream stop: the cleanup awaited pending tasks but only caught `Exception`, missing `CancelledError` (a `BaseException`).
- "No recent check-in" shown even on an actively-monitored device: the row read the stale `slm_last_check` roster field instead of SLMM's live cache (see Changed).
- L1/L10 KPI cards populated but the chart drew no L1/L10 lines: the card chart only had Lp + Leq datasets.
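The `CancelledError` leak in the list above comes from a general asyncio detail: since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so `except Exception` does not catch it. A minimal sketch of a cleanup that handles it, assuming hypothetical `pump` and `stop_tasks` helpers (not the actual proxy code):

```python
import asyncio


async def pump() -> None:
    # Stand-in for one direction of a WebSocket proxy; runs until cancelled.
    while True:
        await asyncio.sleep(3600)


async def stop_tasks(tasks: list) -> list:
    """Cancel the tasks and await each one so no exception is left unretrieved."""
    outcomes = []
    for task in tasks:
        task.cancel()
    for task in tasks:
        try:
            await task
            outcomes.append("finished")
        except asyncio.CancelledError:
            # Not a subclass of Exception, so it needs its own handler.
            outcomes.append("cancelled")
        except Exception:
            outcomes.append("failed")
    return outcomes


async def demo() -> list:
    tasks = [asyncio.create_task(pump()) for _ in range(2)]
    await asyncio.sleep(0)  # let both tasks start before cancelling them
    return await stop_tasks(tasks)


outcomes = asyncio.run(demo())
```

An alternative with the same effect is `await asyncio.gather(*tasks, return_exceptions=True)`, which collects cancellations as results instead of raising them.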
Upgrade Notes
Requires the matching SLMM build (branch `dev`). Terra-View now depends on SLMM's fan-out `/monitor` feed, `/history` trail, `/status` carrying `ln1`/`ln2` + `measurement_start_time`, cached `/roster` status, and the `monitor_enabled` keepalive flag.

```bash
# SLMM (branch dev): REBUILD + MIGRATE (or you'll get `no such column: nl43_status.ln1` 500s)
cd /home/serversdown/slmm && docker compose build slmm && docker compose up -d slmm
docker exec terra-view-slmm-1 python3 migrate_add_ln_percentiles.py
docker exec terra-view-slmm-1 python3 migrate_add_monitor_enabled.py

# Terra-View: NO migration; templates are baked into the image, so rebuild (don't just restart)
cd /home/serversdown/terra-view && docker compose build terra-view && docker compose up -d terra-view
```

The two builds must ship together. Note the `docker-compose.yml` container was renamed for clarity (now `terra-view-terra-view-1`); adjust any `docker exec` scripts that referenced the old name.
Client portal (new — read-only client-facing view)
A scoped, read-only portal at `/portal/*` where a client sees only their locations, live. Built inside Terra-View (no new service), reusing the cached SLMM feed; every route resolves the client through one swappable `get_current_client` gate, so the interim magic/open-link auth can be replaced (M4) without touching routes or templates. Strictly read-only: no device control.

Added
- Per-client scoping + interim auth. New `Client`, `ClientAccessToken`, and a `Project.client_id` FK. A signed (HMAC) session cookie carries the access-token id, re-validated against the DB each request (revoke kills live sessions, with server-side expiry). Entry via a magic link (`/portal/enter/{token}`) or a dev-only plain link (`/portal/open/{id}`, `PORTAL_OPEN_LINKS`, default off).
- Live location view. KPI cards (Lp/Leq/Lmax/L1/L10) + chart populate instantly from cache, then upgrade to a real ~1 Hz WebSocket stream scoped to the client's unit (a scrubbed bridge to the SLMM fan-out feed). The stream auto-closes when the tab is hidden (Page Visibility) and after a 15-min idle cap, so an abandoned tab can't pin the device at 1 Hz / burn cellular.
- Locations overview. Live status map (level-colored dots, dark/light CARTO tiles) + a status rollup (live/offline counts, "loudest now"). Leq is the headline metric.
- Alerts (config → surface → 24/7). Threshold-rule config on the SLM detail page (proxying SLMM's alert CRUD); breach history + ack internally and a read-only, scrubbed history + current-alarm banner + "your alert limits" panel in the portal; enabling a rule pins that device's monitor on so alerts evaluate round-the-clock.
- Operator sharing tools. A "View client portal" preview button and a "Copy client link" modal (mint / list / revoke magic links) on the project page, plus a `backend/portal_admin.py` CLI.
- Field-instrument design. Distinctive themed portal (Hanken Grotesk UI + IBM Plex Mono readouts, panel system, pulsing live dot, staggered reveal) with a light/dark toggle (light default, persisted, no-flash).
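The signed-cookie scheme in the first item of the list above (an HMAC-signed access-token id, re-validated on every request) can be illustrated with the standard library. This is a sketch under assumptions: `sign`, `verify`, the `id.mac` cookie layout, and the in-memory set standing in for the DB lookup are hypothetical, and the portal's real cookie format is not described in these notes.

```python
import hashlib
import hmac
from typing import Optional

# Assumption: in practice the key comes from the SECRET_KEY environment setting.
SECRET_KEY = b"dev-only-not-a-real-secret"


def _mac(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def sign(token_id: int) -> str:
    """Build a cookie value of the form '<token id>.<hex HMAC>'."""
    payload = str(token_id)
    return f"{payload}.{_mac(payload)}"


def verify(cookie: str, active_token_ids: set) -> Optional[int]:
    """Return the token id if the signature is valid and the token is still active."""
    payload, _, mac = cookie.partition(".")
    # Constant-time comparison avoids leaking how much of the MAC matched.
    if not hmac.compare_digest(mac.encode(), _mac(payload).encode()):
        return None
    token_id = int(payload)
    # Stand-in for the per-request DB check: a revoked or expired id is absent.
    return token_id if token_id in active_token_ids else None


cookie = sign(42)
accepted = verify(cookie, {42})
revoked = verify(cookie, set())
forged = verify("42.deadbeef", {42})
```

Re-checking the id against current state on every request is what lets a revoke end a live session even though the cookie's signature is still valid.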
Security
- All scoping enforced server-side (404-not-403, no existence leak); client endpoints return scrubbed projections (no device-health/internal ids); WS frames whitelisted; operator-set strings HTML-escaped before injection (XSS). Pre-merge code review hardened cookie expiry, open-links default, and the slug collision. Remaining hardening (reverse proxy, TLS, `SECRET_KEY`, M4 auth) is tracked in `docs/CLIENT_PORTAL.md` → "Security hardening backlog".
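Two of the measures above, whitelisting WS frame fields and escaping operator-set strings, can be shown in a few lines. The key list and the helper names `scrub_frame` and `safe_label` are assumptions made for this example; the portal's actual whitelist is not given in these notes.

```python
import html

# Assumption: the real whitelist is not listed in these notes.
ALLOWED_FRAME_KEYS = ("lp", "leq", "lmax", "ln1", "ln2", "timestamp")


def scrub_frame(frame: dict) -> dict:
    """Keep only whitelisted keys so internal fields never reach the client."""
    return {key: frame[key] for key in ALLOWED_FRAME_KEYS if key in frame}


def safe_label(operator_text: str) -> str:
    """Escape operator-set strings before placing them into HTML."""
    return html.escape(operator_text, quote=True)


raw = {"lp": 61.2, "leq": 58.9, "device_id": 7, "battery": "low"}
scrubbed = scrub_frame(raw)
label = safe_label('<b onclick="x()">Site A</b>')
```

A whitelist fails closed: a field added to the upstream feed later stays hidden from clients until someone explicitly allows it.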
Upgrade Notes
- Migration: `docker compose exec web-app python3 backend/migrate_add_client_portal.py` (adds `projects.client_id`; the `clients`/`client_access_tokens` tables auto-create).
- Set a real `SECRET_KEY` in any internet-facing env (signs session cookies), and keep `PORTAL_OPEN_LINKS=false` there.
- Portal alerts depend on the SLMM `dev` alert engine (rules/events/evaluator + cooldown + keepalive coupling); same build pairing as above.
Portal authentication (Phase 1)
- Each project's client portal is now gated by a secure per-project link + shared password (argon2-hashed). Operators manage it from the project page's Portal access panel (enable, generate password, copy link).
- Per-project session isolation (a session for one project can't read another's data); brute-force lockout (5 tries / 15 min) on the password gate.
- Retired the interim magic-link / `PORTAL_OPEN_LINKS` open links and the `portal_admin.py mint-link` command.
- Upgrade: new `argon2-cffi` dependency → rebuild the image, then run `python3 backend/migrate_add_project_portal_auth.py` per DB (adds the `projects.portal_*` columns). `SECRET_KEY` and `COOKIE_SECURE` are now passed through in `docker-compose.yml` (settable via a `.env` file); set a real `SECRET_KEY` (and `COOKIE_SECURE=true` once on HTTPS) before the portal faces the internet.
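The brute-force lockout (5 tries / 15 min) could be modeled as a sliding window of recent failures per project. This is one plausible reading, not the shipped logic: `LoginThrottle`, its methods, and the sliding-window interpretation are assumptions, and the password check itself (argon2) is left out.

```python
import time
from collections import defaultdict, deque
from typing import Callable

MAX_TRIES = 5
WINDOW_SECONDS = 15 * 60


class LoginThrottle:
    """Sliding-window lockout: locked once MAX_TRIES failures fall inside the window."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._failures = defaultdict(deque)

    def _prune(self, key: str) -> deque:
        # Drop failures that have aged out of the window.
        failures = self._failures[key]
        cutoff = self._clock() - WINDOW_SECONDS
        while failures and failures[0] <= cutoff:
            failures.popleft()
        return failures

    def is_locked(self, key: str) -> bool:
        return len(self._prune(key)) >= MAX_TRIES

    def record_failure(self, key: str) -> None:
        self._prune(key).append(self._clock())

    def reset(self, key: str) -> None:
        # Called after a successful login.
        self._failures.pop(key, None)


now = [0.0]  # injectable fake clock so the example runs instantly
throttle = LoginThrottle(clock=lambda: now[0])
for _ in range(MAX_TRIES):
    throttle.record_failure("project-a")
locked_now = throttle.is_locked("project-a")
other_project_locked = throttle.is_locked("project-b")
now[0] = WINDOW_SECONDS + 1
locked_later = throttle.is_locked("project-a")
```

Keying the throttle per project keeps a lockout on one project's gate from affecting another's, in line with the per-project session isolation described above.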