From 43b8e53d2d7961ffb68c0313ce46c7feb7d1c8e5 Mon Sep 17 00:00:00 2001
From: serversdown
Date: Mon, 22 Jun 2026 20:54:43 +0000
Subject: [PATCH] chore: version bump

---
 CHANGELOG.md | 46 +++++++++++++++++++++++++++
 README.md    | 87 ++++++++++++++++++++++++++++++++++++++++++++++++----
 app/main.py  |  2 +-
 3 files changed, 128 insertions(+), 7 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index d83bab1..35fcc12 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,52 @@ All notable changes to SLMM (Sound Level Meter Manager) will be documented in th
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.4.0] - 2026-06-22
+
+### Added
+
+#### Live Monitor (fan-out feed)
+- **Per-device fan-out monitor** - one shared, cached live feed per device. Multiple clients (dashboards, portal, charts) subscribe to the same stream instead of each fighting for the NL-43's single TCP connection: one poller reads the device, all subscribers get the same frames.
+- **WebSocket monitor** - `WS /api/nl43/{unit_id}/monitor` delivers an instant first frame from cache, then live updates.
+- **Monitor control** - `POST /api/nl43/{unit_id}/monitor/{start|stop}`, `GET /api/nl43/_monitor/status`. A persistent `monitor_enabled` flag auto-starts the keepalive on boot.
+- **Adaptive polling** - poll rate adapts to demand; unreachable devices back off; a device-offline alert fires when a monitored unit drops.
+- **De-duplication** - the background poller skips units already covered by an active monitor (no double-polling); a heartbeat keeps the feed warm.
+- **Lower latency** - the monitor caches run state, roughly halving live-feed latency; fan-out emits an instant first frame + offline status to new clients.
+
+#### Alert Engine
+- **Threshold rules** - per-device alert rules (metric + threshold + cooldown) with full CRUD: `POST/GET/PUT/DELETE /api/nl43/{unit_id}/alerts/rules[/{rule_id}]`.
+- **Events + state machine** - onset/clear tracking via `GET /api/nl43/{unit_id}/alerts/events`; acknowledge with `POST .../events/{event_id}/ack`. A `cooldown_s` is enforced between onsets.
+- **24/7 evaluation** - enabled rules pin the monitor on, so rules evaluate continuously even with no UI client connected.
+- **Resilience** - editing or deleting a rule resets its state and closes any open event; device-offline events are raised when a monitored unit goes unreachable.
+
+#### Data & History
+- **Live-chart backfill** - a downsampled DOD trail is persisted to a new `nl43_readings` table, exposed via `GET /api/nl43/{unit_id}/history` so charts can backfill recent history on load.
+- **LN1/LN2 percentiles** - L1/L10 (configurable percentiles) surfaced through SLMM in the status and live-feed payloads.
+- **measurement_start_time** included in the cached `/status` response.
+
+#### Device control
+- **Per-device disconnect** - `POST /api/nl43/{unit_id}/disconnect` drops a device's pooled connection.
+- **Deactivate / standby** - `POST /api/nl43/{unit_id}/deactivate` and global `POST /api/nl43/_system/standby` to quiesce polling/monitoring.
+
+### Changed
+- **DRD streaming reuses the pooled connection** rather than opening a separate socket, avoiding contention with the persistent pool on a single-connection device.
+- **Connection pool** - idle-TTL / max-age checks can now be disabled; pool status is logged periodically.
+
+### Fixed
+- **Measurement-start confirmation** - `/start` now recognizes the device's `Start` state. It previously waited for `Measure`, which never matched, so the start cycle ran the full retry loop and Terra-View's proxy timed out with a misleading "Unknown error" even though the device had started.
+- **Garbled reads** - corrupted measurement-state reads that produced phantom STOPPED/STARTED transitions are now ignored.
+- **DOD parsing** - corrected field parsing and stopped spurious measurement-time resets.
+- **Monitor WebSocket** - quieted a send-after-close race on client disconnect.
+
+### Database
+- **New tables** (auto-created on startup via `Base.metadata.create_all`): `alert_rules`, `alert_events`, `nl43_readings`.
+- **Migrations for existing tables** (run once per database): `migrate_add_ln_percentiles.py` (LN1/LN2 on `nl43_status`), `migrate_add_monitor_enabled.py` (`monitor_enabled` on `nl43_config`).
+
+### Notes
+- Pairs with the matching Terra-View `dev` build, which reads SLMM's `/monitor` fan-out feed for live SLM dashboards (L1/L10 lines, live-chart backfill). Ship the two together.
+
+---
+
 ## [0.3.0] - 2026-02-17
 
 ### Added
diff --git a/README.md b/README.md
index 441a1e6..115c645 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # SLMM - Sound Level Meter Manager
 
-**Version 0.3.0**
+**Version 0.4.0**
 
 Backend API service for controlling and monitoring Rion NL-43/NL-53 Sound Level Meters via TCP and FTP protocols.
 
@@ -12,6 +12,9 @@ SLMM is a standalone backend module that provides REST API routing and command t
 
 ## Features
 
+- **Live Monitor (fan-out)**: One shared cached live feed per device — many clients subscribe to the same stream instead of fighting over the meter's single TCP connection
+- **Alert Engine**: Per-device threshold rules with onset/clear events, cooldowns, acks, and 24/7 evaluation
+- **History & Percentiles**: Downsampled DOD trail + history endpoint for live-chart backfill; LN1/LN2 (L1/L10) percentiles surfaced through the feed
 - **Persistent TCP Connections**: Cached per-device connections with OS-level keepalive, tuned for cellular modem reliability
 - **Background Polling**: Continuous automatic polling of devices with configurable intervals
 - **Offline Detection**: Automatic device reachability tracking with failure counters
@@ -44,6 +47,30 @@ SLMM is a standalone backend module that provides REST API routing and command t
 └──────────────┘
 ```
 
+### Live Monitor — Fan-Out Feed (v0.4.0)
+
+The NL-43 allows only one TCP control connection at a time, so multiple clients
+polling the same device directly would contend for it. The monitor solves this
+with a single shared, cached feed per device:
+
+- **One reader, many subscribers**: a single poller reads the device; every
+  WebSocket subscriber (`WS /api/nl43/{unit_id}/monitor`) receives the same
+  frames — an instant first frame from cache, then live updates.
+- **Persistent + auto-start**: a `monitor_enabled` flag keeps the feed running
+  and auto-starts it on boot. Enabled alert rules pin the monitor on for 24/7
+  evaluation even with no UI connected.
+- **Adaptive & deduplicated**: poll rate adapts to demand, unreachable devices
+  back off, and the background poller skips units already covered by a monitor.
+
+### Alert Engine (v0.4.0)
+
+Per-device threshold alerting evaluated against the live feed:
+
+- **Rules**: metric + threshold + `cooldown_s`, full CRUD per device
+- **Events**: onset/clear state machine, acknowledgement, and a device-offline
+  alert when a monitored unit drops
+- **Robust**: editing/deleting a rule resets its state and closes open events
+
 ### Persistent TCP Connection Pool (v0.3.0)
 
 SLMM maintains persistent TCP connections to devices with OS-level keepalive, designed for reliable operation over cellular modems:
@@ -145,8 +172,32 @@ Logs are written to:
 |--------|----------|-------------|
 | GET | `/api/nl43/{unit_id}/status` | Get cached measurement snapshot (updated by background poller) |
 | GET | `/api/nl43/{unit_id}/live` | Request fresh DOD data from device (bypasses cache) |
+| GET | `/api/nl43/{unit_id}/history` | Downsampled DOD trail for live-chart backfill |
 | WS | `/api/nl43/{unit_id}/stream` | WebSocket stream for real-time DRD data |
 
+### Live Monitor (fan-out feed)
+
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| WS | `/api/nl43/{unit_id}/monitor` | Subscribe to the shared cached live feed (instant first frame) |
+| POST | `/api/nl43/{unit_id}/monitor/start` | Start the device's monitor feed |
+| POST | `/api/nl43/{unit_id}/monitor/stop` | Stop the device's monitor feed |
+| GET | `/api/nl43/_monitor/status` | Global monitor status across devices |
+| POST | `/api/nl43/{unit_id}/disconnect` | Drop the device's pooled TCP connection |
+| POST | `/api/nl43/{unit_id}/deactivate` | Quiesce polling/monitoring for one device |
+| POST | `/api/nl43/_system/standby` | Global standby — quiesce all polling/monitoring |
+
+### Alerts
+
+| Method | Endpoint | Description |
+|--------|----------|-------------|
+| GET | `/api/nl43/{unit_id}/alerts/rules` | List alert rules for a device |
+| POST | `/api/nl43/{unit_id}/alerts/rules` | Create an alert rule (metric, threshold, cooldown) |
+| PUT | `/api/nl43/{unit_id}/alerts/rules/{rule_id}` | Update a rule (resets its state, closes open events) |
+| DELETE | `/api/nl43/{unit_id}/alerts/rules/{rule_id}` | Delete a rule |
+| GET | `/api/nl43/{unit_id}/alerts/events` | List alert events (onset/clear) |
+| POST | `/api/nl43/{unit_id}/alerts/events/{event_id}/ack` | Acknowledge an event |
+
 ### Background Polling
 
 | Method | Endpoint | Description |
@@ -273,11 +324,35 @@ Caches latest measurement snapshot:
 - `sd_remaining_mb`: Free SD card space (MB)
 - `sd_free_ratio`: SD card free space ratio
 - `raw_payload`: Raw device response data
-- `is_reachable`: Device reachability status (Boolean) ⭐ NEW
-- `consecutive_failures`: Count of consecutive poll failures ⭐ NEW
-- `last_poll_attempt`: Last time background poller attempted to poll ⭐ NEW
-- `last_success`: Last successful poll timestamp ⭐ NEW
-- `last_error`: Last error message (truncated to 500 chars) ⭐ NEW
+- `is_reachable`: Device reachability status (Boolean)
+- `consecutive_failures`: Count of consecutive poll failures
+- `last_poll_attempt`: Last time background poller attempted to poll
+- `last_success`: Last successful poll timestamp
+- `last_error`: Last error message (truncated to 500 chars)
+- `ln1` / `ln2`: LN1/LN2 (L1/L10) percentile levels ⭐ v0.4.0
+
+### NL43Readings Table ⭐ v0.4.0
+Downsampled DOD trail backing the live-chart history endpoint (one row/minute,
+pruned to a retention window — viewing only, not the report source):
+- `id` (PK), `unit_id`, `timestamp`
+- `lp` / `leq` / `lmax` / `ln1` / `ln2`: cached level samples
+
+### AlertRule Table ⭐ v0.4.0
+Per-device threshold alert rules:
+- `id` (PK), `unit_id`, `name`, `enabled`
+- `metric`, `comparison` (above/below), `threshold_db`, `clear_margin_db` (hysteresis)
+- `duration_s` (sustained), `cooldown_s` (min seconds between onsets)
+- `channels` / `recipients`, optional `schedule_start`/`schedule_end`/`schedule_days`
+
+### AlertEvent Table ⭐ v0.4.0
+Alert onset/clear events for history, inbox, and acknowledgement:
+- `id` (PK), `unit_id`, `rule_id`, `rule_name`, `metric`, `threshold_db`
+- `onset_at` / `onset_value`, `peak_value`, `clear_at`, `status` (active/cleared)
+- `acknowledged_at` / `acknowledged_by`, `notes`
+
+> New tables (`alert_rules`, `alert_events`, `nl43_readings`) auto-create on
+> startup. Existing-table columns ship with migrations:
+> `migrate_add_ln_percentiles.py`, `migrate_add_monitor_enabled.py`.
 
 ## Protocol Details
 
diff --git a/app/main.py b/app/main.py
index 74f94c7..d0b6145 100644
--- a/app/main.py
+++ b/app/main.py
@@ -71,7 +71,7 @@ async def lifespan(app: FastAPI):
 app = FastAPI(
     title="SLMM NL43 Addon",
     description="Standalone module for NL43 configuration and status APIs with background polling",
-    version="0.3.0",
+    version="0.4.0",
     lifespan=lifespan,
 )
 
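
Reviewer note: the onset/clear behaviour this patch documents for the alert engine (threshold, `clear_margin_db` hysteresis, `cooldown_s` between onsets) can be illustrated with a minimal sketch. It is not SLMM's implementation. The `ThresholdRule` and `RuleState` classes are hypothetical, only the "above" comparison is handled, and the `duration_s`, schedule, and offline handling are left out. Field names are borrowed from the AlertRule columns purely for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdRule:
    """Hypothetical in-memory rule; field names mirror the AlertRule columns."""
    threshold_db: float
    clear_margin_db: float = 2.0  # hysteresis band below the threshold
    cooldown_s: float = 60.0      # minimum seconds between onsets


class RuleState:
    """Tracks one rule's onset/clear state for a single device (assumed design)."""

    def __init__(self, rule: ThresholdRule) -> None:
        self.rule = rule
        self.active = False
        self.last_onset: Optional[float] = None

    def update(self, value_db: float, now: float) -> Optional[str]:
        """Feed one sample; return "onset", "clear", or None."""
        rule = self.rule
        if not self.active:
            in_cooldown = (
                self.last_onset is not None
                and now - self.last_onset < rule.cooldown_s
            )
            if value_db >= rule.threshold_db and not in_cooldown:
                self.active = True
                self.last_onset = now
                return "onset"
            return None
        # Active: clear only once the level falls below threshold minus margin.
        if value_db < rule.threshold_db - rule.clear_margin_db:
            self.active = False
            return "clear"
        return None
```

With a threshold of 85 dB and a 2 dB margin, a level that dips to 84 dB keeps the event open, and the event only clears below 83 dB. A second exceedance inside the cooldown window is suppressed rather than raised as a new onset.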