• v0.4.0 89b6892656

    v0.4.0 Stable

    serversdown released this 2026-06-22 19:00:49 -04:00 | 0 commits to main since this release

    [0.4.0] - 2026-06-22

    Added

    Live Monitor (fan-out feed)

    • Per-device fan-out monitor - one shared, cached live feed per device. Multiple clients (dashboards, portal, charts) subscribe to the same stream instead of each fighting for the NL-43's single TCP connection: one poller reads the device, all subscribers get the same frames.
    • WebSocket monitor - WS /api/nl43/{unit_id}/monitor delivers an instant first frame from cache, then live updates.
    • Monitor control - POST /api/nl43/{unit_id}/monitor/{start|stop}, GET /api/nl43/_monitor/status. A persistent monitor_enabled flag auto-starts the keepalive on boot.
    • Adaptive polling - poll rate adapts to demand; unreachable devices back off; a device-offline alert fires when a monitored unit drops.
    • De-duplication - the background poller skips units already covered by an active monitor (no double-polling); a heartbeat keeps the feed warm.
    • Lower latency - the monitor caches run state, roughly halving live-feed latency; fan-out sends new clients an instant first frame and the current offline status.

    Alert Engine

    • Threshold rules - per-device alert rules (metric + threshold + cooldown) with full CRUD: POST/GET/PUT/DELETE /api/nl43/{unit_id}/alerts/rules[/{rule_id}].
    • Events + state machine - onset/clear tracking via GET /api/nl43/{unit_id}/alerts/events; acknowledge with POST .../events/{event_id}/ack. Each rule's cooldown_s is enforced between onsets.
    • 24/7 evaluation - enabled rules pin the monitor on, so rules evaluate continuously even with no UI client connected.
    • Resilience - editing or deleting a rule resets its state and closes any open event; device-offline events are raised when a monitored unit goes unreachable.
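
    The onset/clear behaviour with a cooldown can be sketched as a small state machine. This is an assumption-laden illustration rather than the engine's real logic: the RuleState class is hypothetical, the comparison is assumed to be "value at or above threshold", and the cooldown is assumed to be measured from the previous onset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleState:
    """Hypothetical per-rule state: threshold, cooldown, and onset tracking."""

    threshold: float
    cooldown_s: float
    active: bool = False
    last_onset: Optional[float] = None

    def evaluate(self, value: float, now: float) -> Optional[str]:
        """Return "onset", "clear", or None for a single reading."""
        if not self.active and value >= self.threshold:
            in_cooldown = (
                self.last_onset is not None
                and now - self.last_onset < self.cooldown_s
            )
            if in_cooldown:
                return None  # too soon after the previous onset
            self.active = True
            self.last_onset = now
            return "onset"
        if self.active and value < self.threshold:
            self.active = False
            return "clear"
        return None

    def reset(self) -> None:
        """Assumed equivalent of editing a rule: forget all prior state."""
        self.active = False
        self.last_onset = None


rule = RuleState(threshold=85.0, cooldown_s=60.0)
readings = [(80.0, 0), (90.0, 10), (92.0, 20), (70.0, 30), (95.0, 40), (95.0, 80)]
events = [rule.evaluate(value, now) for value, now in readings]
```

    Here the exceedance at t=40 falls inside the 60-second cooldown that started at t=10, so no new onset is raised until t=80.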

    Data & History

    • Live-chart backfill - a downsampled DOD trail is persisted to a new nl43_readings table, exposed via GET /api/nl43/{unit_id}/history so charts can backfill recent history on load.
    • LN1/LN2 percentiles - the device's configurable LN1/LN2 percentile values (L1/L10) are surfaced through SLMM in the status and live-feed payloads.
    • Measurement start time - measurement_start_time is now included in the cached /status response.
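
    The notes do not say how the trail is downsampled before it is written to nl43_readings. One simple possibility, shown purely as a sketch, is to keep at most one reading per fixed interval. The downsample function, its parameters, and the (timestamp, value) tuple shape are assumptions for illustration.

```python
from typing import Iterable, List, Optional, Tuple

Reading = Tuple[float, float]  # (unix timestamp in seconds, level in dB); assumed shape


def downsample(readings: Iterable[Reading], min_interval_s: float) -> List[Reading]:
    """Keep a reading only if at least min_interval_s passed since the last kept one."""
    kept: List[Reading] = []
    last_ts: Optional[float] = None
    for ts, value in readings:
        if last_ts is None or ts - last_ts >= min_interval_s:
            kept.append((ts, value))
            last_ts = ts
    return kept


# Eleven one-second readings thinned to one every five seconds.
raw = [(float(t), 50.0 + t) for t in range(11)]
trail = downsample(raw, min_interval_s=5.0)
```

    A chart loading its backfill would then plot the thinned trail first and append live frames as they arrive.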

    Device control

    • Per-device disconnect - POST /api/nl43/{unit_id}/disconnect drops a device's pooled connection.
    • Deactivate / standby - POST /api/nl43/{unit_id}/deactivate and global POST /api/nl43/_system/standby to quiesce polling/monitoring.

    Changed

    • DRD streaming reuses the pooled connection rather than opening a separate socket, avoiding contention with the persistent pool on a single-connection device.
    • Connection pool - idle-TTL / max-age checks can now be disabled; pool status is logged periodically.

    Fixed

    • Measurement-start confirmation - /start now recognizes the device's Start state. Previously it waited for Measure, which never matched, so every start ran the full retry loop. Terra-View's proxy then timed out with a misleading "Unknown error" even though the device had started.
    • Garbled reads - corrupted measurement-state reads that produced phantom STOPPED/STARTED transitions are now ignored.
    • DOD parsing - corrected field parsing and stopped spurious measurement-time resets.
    • Monitor WebSocket - quieted a send-after-close race on client disconnect.
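
    The measurement-start fix can be pictured with a small confirmation loop. This is only a sketch of the failure mode, not the actual /start handler: confirm_started, its parameters, and the "Stop" sample state are hypothetical, and a real loop would also wait between reads.

```python
from typing import Callable, FrozenSet, Optional


def confirm_started(
    read_state: Callable[[], str],
    started_states: FrozenSet[str],
    retries: int = 5,
) -> Optional[int]:
    """Return the attempt number on which a started state was seen, else None."""
    for attempt in range(1, retries + 1):
        if read_state() in started_states:
            return attempt
    return None  # retries exhausted; the caller would report a failure


# Device that reports "Stop" once and then "Start" (state names partly assumed).
states = iter(["Stop", "Start", "Start", "Start", "Start"])
fixed = confirm_started(lambda: next(states), frozenset({"Start"}))

# Old behaviour: waiting for "Measure" never matches a device that reports "Start".
old = confirm_started(lambda: "Start", frozenset({"Measure"}))
```

    With the old expectation the loop always runs to exhaustion, which matches the symptom described above: the device is running, but the caller gives up and reports an error.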

    Database

    • New tables (auto-created on startup via Base.metadata.create_all): alert_rules, alert_events, nl43_readings.
    • Migrations for existing tables (run once per database): migrate_add_ln_percentiles.py (LN1/LN2 on nl43_status), migrate_add_monitor_enabled.py (monitor_enabled on nl43_config).
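
    For readers unfamiliar with one-off column migrations, the pattern looks roughly like this. The sketch assumes a SQLite database and an INTEGER column type, neither of which is confirmed by these notes, and add_column_if_missing is a hypothetical helper; the shipped migrate_add_*.py scripts may work differently.

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, ddl: str
) -> bool:
    """Add a column unless it already exists. Returns True if it was added."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False  # already migrated, so re-running is harmless
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    conn.commit()
    return True


# Demo against an in-memory database with a stand-in nl43_config table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nl43_config (unit_id TEXT PRIMARY KEY)")
first_run = add_column_if_missing(conn, "nl43_config", "monitor_enabled", "INTEGER DEFAULT 0")
second_run = add_column_if_missing(conn, "nl43_config", "monitor_enabled", "INTEGER DEFAULT 0")
```

    Checking for the column first makes the script safe to run a second time by mistake.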

    Notes

    • Pairs with the matching Terra-View dev build, which reads SLMM's /monitor fan-out feed for live SLM dashboards (L1/L10 lines, live-chart backfill). Ship the two together.
