Compare commits
54 Commits
d2b47156d8...dev
| Author | SHA1 | Date | |
|---|---|---|---|
| | 43b8e53d2d | | |
| | 6d1c426ee4 | | |
| | ad6071b790 | | |
| | cfdeada9d6 | | |
| | b51fefca2b | | |
| | 5bc542e92f | | |
| | 1f5f1fb1f6 | | |
| | b4cea2f287 | | |
| | d1d694302c | | |
| | 43e72ae3c3 | | |
| | 9d34779171 | | |
| | 87c06f1519 | | |
| | ba622c67d8 | | |
| | 6b1ec75396 | | |
| | 9c43e68534 | | |
| | aa3e088b64 | | |
| | 8c17af4849 | | |
| | b954eb8c89 | | |
| | 0793e7df01 | | |
| | 51dd6b682d | | |
| | a7983d2958 | | |
| | d6dd2e736b | | |
| | af86cf713e | | |
| | e3f9ca7f5b | | |
| | 450509d210 | | |
| | fefa9eace8 | | |
| | 98a8d357e5 | | |
| | 0a7422eceb | | |
| | 996b993cb9 | | |
| | 01337696b3 | | |
| | a302fd15d4 | | |
| | af5ecc1a92 | | |
| | ad1a40e0aa | | |
| | b62e84f8b3 | | |
| | a5f8d1b2c7 | | |
| | a1a80bbb4d | | |
| | 005e0091fe | | |
| | e6ac80df6c | | |
| | 7070b948a8 | | |
| | 3b6e9ad3f0 | | |
| | eb0cbcc077 | | |
| | cc0a5bdf84 | | |
| | bf5f222511 | | |
| | eb39a9d1d0 | | |
| | 67d63b4173 | | |
| | 25cf9528d0 | | |
| | 738ad7878e | | |
| | 152377d608 | | |
| | 4868381053 | | |
| | b4bbfd2b01 | | |
| | 82651f71b5 | | |
| | 182920809d | | |
| | 2a3589ca5c | | |
| | d43ef7427f | | |
@@ -1,5 +1,8 @@
|
|||||||
/manuals/
|
/manuals/
|
||||||
/data/
|
/data/
|
||||||
|
/data-dev/
|
||||||
|
/SLM-stress-test/stress_test_logs/
|
||||||
|
/SLM-stress-test/tcpdump-runs/
|
||||||
|
|
||||||
# Python cache
|
# Python cache
|
||||||
__pycache__/
|
__pycache__/
|
||||||
@@ -12,3 +15,5 @@ __pycache__/
|
|||||||
*.egg-info/
|
*.egg-info/
|
||||||
dist/
|
dist/
|
||||||
build/
|
build/
|
||||||
|
|
||||||
|
*.pcap
|
||||||
+251
@@ -0,0 +1,251 @@
# Changelog

All notable changes to SLMM (Sound Level Meter Manager) will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.0] - 2026-06-22

### Added

#### Live Monitor (fan-out feed)
- **Per-device fan-out monitor** - one shared, cached live feed per device. Multiple clients (dashboards, portal, charts) subscribe to the same stream instead of each fighting for the NL-43's single TCP connection: one poller reads the device, all subscribers get the same frames.
- **WebSocket monitor** - `WS /api/nl43/{unit_id}/monitor` delivers an instant first frame from cache, then live updates.
- **Monitor control** - `POST /api/nl43/{unit_id}/monitor/{start|stop}`, `GET /api/nl43/_monitor/status`. A persistent `monitor_enabled` flag auto-starts the keepalive on boot.
- **Adaptive polling** - poll rate adapts to demand; unreachable devices back off; a device-offline alert fires when a monitored unit drops.
- **De-duplication** - the background poller skips units already covered by an active monitor (no double-polling); a heartbeat keeps the feed warm.
- **Lower latency** - the monitor caches run state, roughly halving live-feed latency; fan-out emits an instant first frame + offline status to new clients.

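The fan-out idea in the Live Monitor entries can be sketched in a few lines. This is a minimal illustration only, not SLMM's implementation: the `FanOutFeed` class, its method names, and the queue size are assumptions made for the example. It shows one publisher caching the latest frame and handing the same frame to every subscriber, with late joiners served from the cache first.

```python
import asyncio
from typing import Optional


class FanOutFeed:
    """One reader, many subscribers, with a cached last frame (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers = set()
        self._last_frame: Optional[dict] = None

    def subscribe(self) -> asyncio.Queue:
        """Register a subscriber; it gets the cached frame immediately, if one exists."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=16)  # size is an arbitrary choice
        if self._last_frame is not None:
            queue.put_nowait(self._last_frame)  # instant first frame from cache
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, frame: dict) -> None:
        """Cache the frame and deliver the same object to every subscriber."""
        self._last_frame = frame
        for queue in self._subscribers:
            if queue.full():
                queue.get_nowait()  # drop the oldest frame for a slow client
            queue.put_nowait(frame)


async def demo() -> list:
    feed = FanOutFeed()
    early = feed.subscribe()        # joins before any data exists
    feed.publish({"lp": 52.1})      # the single poller read the device once
    late = feed.subscribe()         # joins later and is served from cache
    feed.publish({"lp": 53.4})
    return [early.get_nowait(), early.get_nowait(),
            late.get_nowait(), late.get_nowait()]


frames = asyncio.run(demo())
```

Both subscribers end up with the same two frames even though the device was read only twice. Dropping the oldest frame for a slow client is one possible back-pressure policy; the changelog does not say which policy the service uses.
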
#### Alert Engine
- **Threshold rules** - per-device alert rules (metric + threshold + cooldown) with full CRUD: `POST/GET/PUT/DELETE /api/nl43/{unit_id}/alerts/rules[/{rule_id}]`.
- **Events + state machine** - onset/clear tracking via `GET /api/nl43/{unit_id}/alerts/events`; acknowledge with `POST .../events/{event_id}/ack`. A `cooldown_s` is enforced between onsets.
- **24/7 evaluation** - enabled rules pin the monitor on, so rules evaluate continuously even with no UI client connected.
- **Resilience** - editing or deleting a rule resets its state and closes any open event; device-offline events are raised when a monitored unit goes unreachable.

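The onset/clear behaviour with a cooldown can be pictured as a two-state machine. The sketch below is a hedged illustration: only `cooldown_s` is a name taken from the changelog, while `ThresholdRule`, `evaluate`, and the choice to measure the cooldown from onset to onset are assumptions of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdRule:
    """Hypothetical in-memory rule; only cooldown_s is a name from the changelog."""
    threshold: float
    cooldown_s: float
    active: bool = False                  # True between an onset and its clear
    last_onset: Optional[float] = None    # timestamp of the most recent onset

    def evaluate(self, value: float, now: float) -> Optional[str]:
        """Return "onset", "clear", or None for one sample."""
        if not self.active:
            in_cooldown = (
                self.last_onset is not None
                and now - self.last_onset < self.cooldown_s
            )
            if value >= self.threshold and not in_cooldown:
                self.active = True
                self.last_onset = now
                return "onset"
            return None
        if value < self.threshold:
            self.active = False
            return "clear"
        return None


rule = ThresholdRule(threshold=85.0, cooldown_s=60.0)
samples = [(0, 80.0), (10, 90.0), (20, 91.0), (30, 70.0), (40, 95.0), (75, 95.0)]
transitions = [
    (t, event)
    for t, value in samples
    if (event := rule.evaluate(value, t)) is not None
]
```

The exceedance at `t=40` is suppressed because it falls inside the 60-second cooldown that started at the `t=10` onset; the next onset is only raised at `t=75`.
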
#### Data & History
- **Live-chart backfill** - a downsampled DOD trail is persisted to a new `nl43_readings` table, exposed via `GET /api/nl43/{unit_id}/history` so charts can backfill recent history on load.
- **LN1/LN2 percentiles** - L1/L10 (configurable percentiles) surfaced through SLMM in the status and live-feed payloads.
- **measurement_start_time** included in the cached `/status` response.

#### Device control
- **Per-device disconnect** - `POST /api/nl43/{unit_id}/disconnect` drops a device's pooled connection.
- **Deactivate / standby** - `POST /api/nl43/{unit_id}/deactivate` and global `POST /api/nl43/_system/standby` to quiesce polling/monitoring.

### Changed
- **DRD streaming reuses the pooled connection** rather than opening a separate socket, avoiding contention with the persistent pool on a single-connection device.
- **Connection pool** - idle-TTL / max-age checks can now be disabled; pool status is logged periodically.

### Fixed
- **Measurement-start confirmation** - `/start` now recognizes the device's `Start` state. It previously waited for `Measure`, which never matched, so the start cycle ran the full retry loop and Terra-View's proxy timed out with a misleading "Unknown error" even though the device had started.
- **Garbled reads** - corrupted measurement-state reads that produced phantom STOPPED/STARTED transitions are now ignored.
- **DOD parsing** - corrected field parsing and stopped spurious measurement-time resets.
- **Monitor WebSocket** - quieted a send-after-close race on client disconnect.

### Database
- **New tables** (auto-created on startup via `Base.metadata.create_all`): `alert_rules`, `alert_events`, `nl43_readings`.
- **Migrations for existing tables** (run once per database): `migrate_add_ln_percentiles.py` (LN1/LN2 on `nl43_status`), `migrate_add_monitor_enabled.py` (`monitor_enabled` on `nl43_config`).

### Notes
- Pairs with the matching Terra-View `dev` build, which reads SLMM's `/monitor` fan-out feed for live SLM dashboards (L1/L10 lines, live-chart backfill). Ship the two together.

---

## [0.3.0] - 2026-02-17

### Added

#### Persistent TCP Connection Pool
- **Connection reuse** - TCP connections are cached per device and reused across commands, eliminating repeated TCP handshakes over cellular modems
- **OS-level TCP keepalive** - Configurable keepalive probes keep cellular NAT tables alive and detect dead connections early (default: probe after 15s idle, every 10s, 3 failures = dead)
- **Transparent retry** - If a cached connection goes stale, the system automatically retries with a fresh connection so failures are never visible to the caller
- **Stale connection detection** - Multi-layer detection via idle TTL, max age, transport state, and reader EOF checks
- **Background cleanup** - Periodic task (every 30s) evicts expired connections from the pool
- **Master switch** - Set `TCP_PERSISTENT_ENABLED=false` to revert to per-request connection behavior

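For readers unfamiliar with OS-level keepalive, the sketch below shows how the three timings map onto standard socket options in Python. It is not taken from SLMM: `enable_keepalive` is a hypothetical helper, and the per-socket timing options are applied only where the platform's `socket` module defines them (they are available on Linux).

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 15,
                     interval: int = 10, count: int = 3) -> dict:
    """Turn on TCP keepalive and apply per-socket timings where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}
    # TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT are not defined on every platform,
    # so each one is set only if the socket module exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    applied = enable_keepalive(sock)
    keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
finally:
    sock.close()
```

With asyncio streams, the underlying socket can be obtained from `writer.get_extra_info("socket")` after `asyncio.open_connection(...)` and passed to a helper like this one.
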
#### Connection Pool Diagnostics
- `GET /api/nl43/_connections/status` - View pool configuration, active connections, age/idle times, and keepalive settings
- `POST /api/nl43/_connections/flush` - Force-close all cached connections (useful for debugging)
- **Connections tab on roster page** - Live UI showing pool config, active connections with age/idle/alive status, auto-refreshes every 5s, and flush button

#### Environment Variables
- `TCP_PERSISTENT_ENABLED` (default: `true`) - Master switch for persistent connections
- `TCP_IDLE_TTL` (default: `300`) - Close idle connections after N seconds
- `TCP_MAX_AGE` (default: `1800`) - Force reconnect after N seconds
- `TCP_KEEPALIVE_IDLE` (default: `15`) - Seconds idle before keepalive probes start
- `TCP_KEEPALIVE_INTERVAL` (default: `10`) - Seconds between keepalive probes
- `TCP_KEEPALIVE_COUNT` (default: `3`) - Failed probes before declaring connection dead

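One way to read the variables listed above into typed settings is sketched here. The variable names and defaults come from the list; the `PoolSettings` dataclass, the `load_pool_settings` function, and the accepted spellings of a true value are assumptions for illustration and may not match how SLMM parses them.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PoolSettings:
    """Hypothetical container; the real ConnectionPool may store these differently."""
    persistent_enabled: bool
    idle_ttl_s: float
    max_age_s: float
    keepalive_idle_s: int
    keepalive_interval_s: int
    keepalive_count: int


def load_pool_settings(env: Mapping[str, str] = os.environ) -> PoolSettings:
    """Read the documented variables, falling back to the documented defaults."""
    enabled = env.get("TCP_PERSISTENT_ENABLED", "true").strip().lower()
    return PoolSettings(
        # Accepting "1" and "yes" as well as "true" is an assumption of this sketch.
        persistent_enabled=enabled in {"true", "1", "yes"},
        idle_ttl_s=float(env.get("TCP_IDLE_TTL", "300")),
        max_age_s=float(env.get("TCP_MAX_AGE", "1800")),
        keepalive_idle_s=int(env.get("TCP_KEEPALIVE_IDLE", "15")),
        keepalive_interval_s=int(env.get("TCP_KEEPALIVE_INTERVAL", "10")),
        keepalive_count=int(env.get("TCP_KEEPALIVE_COUNT", "3")),
    )


defaults = load_pool_settings({})
overridden = load_pool_settings({"TCP_PERSISTENT_ENABLED": "false", "TCP_IDLE_TTL": "120"})
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the process environment.
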
### Changed
- **Health check endpoint** (`/health/devices`) - Now uses connection pool instead of opening throwaway TCP connections; checks for existing live connections first (zero-cost), only opens new connection through pool if needed
- **Diagnostics endpoint** - Removed separate port 443 modem check (extra handshake waste); TCP reachability test now uses connection pool
- **DRD streaming** - Streaming connections now get TCP keepalive options set; cached connections are evicted before opening dedicated streaming socket
- **Default timeouts tuned for cellular** - Idle TTL raised to 300s (5 min), max age raised to 1800s (30 min) to survive typical polling intervals over cellular links

### Technical Details

#### Architecture
- `ConnectionPool` class in `services.py` manages a single cached connection per device key (NL-43 only supports one TCP connection at a time)
- Uses existing per-device asyncio locks and rate limiting - no changes to concurrency model
- Pool is a module-level singleton initialized from environment variables at import time
- Lifecycle managed via FastAPI lifespan: cleanup task starts on startup, all connections closed on shutdown
- `_send_command_unlocked()` refactored to use acquire/release/discard pattern with single-retry fallback
- Command parsing extracted to `_execute_command()` method for reuse between primary and retry paths

#### Cellular Modem Optimizations
- Keepalive probes at 15s prevent cellular NAT tables from expiring (typically 30-60s timeout)
- 300s idle TTL ensures connections survive between polling cycles (default 60s interval)
- 1800s max age allows a single socket to serve ~30 minutes of polling before forced reconnect
- Health checks and diagnostics produce zero additional TCP handshakes when a pooled connection exists
- Stale `$` prompt bytes drained from idle connections before command reuse

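The timing figures above follow from simple arithmetic on the defaults. The short calculation below spells that out; the constant names are made up for the example, and the dead-peer estimate uses the common approximation of idle time plus interval times probe count, which is an assumption about the OS behaviour rather than something stated in the changelog.

```python
# Defaults quoted in the changelog.
KEEPALIVE_IDLE_S = 15
KEEPALIVE_INTERVAL_S = 10
KEEPALIVE_COUNT = 3
MAX_AGE_S = 1800
POLL_INTERVAL_S = 60
NAT_TIMEOUT_RANGE_S = (30, 60)  # "typically 30-60s" per the changelog


def dead_peer_detection_s(idle: int, interval: int, count: int) -> int:
    """Approximate seconds from last traffic until a silent peer is declared dead."""
    return idle + interval * count


first_probe_s = KEEPALIVE_IDLE_S
detection_s = dead_peer_detection_s(KEEPALIVE_IDLE_S, KEEPALIVE_INTERVAL_S, KEEPALIVE_COUNT)
polls_per_socket = MAX_AGE_S // POLL_INTERVAL_S
probe_beats_nat = first_probe_s < NAT_TIMEOUT_RANGE_S[0]
```

With the defaults, the first probe goes out at 15 seconds, which is inside the shortest quoted NAT timeout of 30 seconds, a silent peer is detected after roughly 45 seconds, and one socket covers about 30 polls at the default 60-second interval.
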
### Breaking Changes
None. This release is fully backward-compatible with v0.2.x. Set `TCP_PERSISTENT_ENABLED=false` for identical behavior to previous versions.

---

## [0.2.1] - 2026-01-23

### Added
- **Roster management**: UI and API endpoints for managing device rosters.
- **Delete config endpoint**: Remove device configuration alongside cached status data.
- **Scheduler hooks**: `start_cycle` and `stop_cycle` helpers for Terra-View scheduling integration.

### Changed
- **FTP logging**: Connection, authentication, and transfer phases now log explicitly.
- **Documentation**: Reorganized docs/scripts and updated API notes for FTP/TCP verification.

## [0.2.0] - 2026-01-15

### Added

#### Background Polling System
- **Continuous automatic device polling** - Background service that continuously polls configured devices
- **Per-device configurable intervals** - Each device can have custom polling interval (10-3600 seconds, default 60)
- **Automatic offline detection** - Devices automatically marked unreachable after 3 consecutive failures
- **Reachability tracking** - Database fields track device health with failure counters and error messages
- **Dynamic sleep scheduling** - Polling service adjusts sleep intervals based on device configurations
- **Graceful lifecycle management** - Background poller starts on application startup and stops cleanly on shutdown

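The interval bounds and the three-failure rule above can be illustrated with a small tracker. This is a sketch under assumptions: the field names `is_reachable`, `consecutive_failures`, and `last_error` mirror the schema in this changelog, but the `Reachability` class, `clamp_interval`, and the decision to clamp out-of-range values (rather than reject them) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MIN_INTERVAL_S = 10        # documented lower bound
MAX_INTERVAL_S = 3600      # documented upper bound
FAILURE_THRESHOLD = 3      # consecutive failures before a device counts as unreachable
MAX_ERROR_CHARS = 500      # last_error is truncated to 500 characters


def clamp_interval(seconds: int) -> int:
    """Force an interval into the documented range (clamping is this sketch's choice)."""
    return max(MIN_INTERVAL_S, min(MAX_INTERVAL_S, seconds))


@dataclass
class Reachability:
    is_reachable: bool = True
    consecutive_failures: int = 0
    last_error: Optional[str] = None

    def record_success(self) -> None:
        """A successful poll clears the failure streak."""
        self.is_reachable = True
        self.consecutive_failures = 0

    def record_failure(self, error: str) -> None:
        """Count the failure and flip to unreachable once the threshold is hit."""
        self.consecutive_failures += 1
        self.last_error = error[:MAX_ERROR_CHARS]
        if self.consecutive_failures >= FAILURE_THRESHOLD:
            self.is_reachable = False


state = Reachability()
state.record_failure("timeout")
state.record_failure("timeout")
reachable_after_two = state.is_reachable
state.record_failure("x" * 600)
reachable_after_three = state.is_reachable
stored_error_length = len(state.last_error)
state.record_success()
```

Two failures leave the device marked reachable; the third flips it, and a single success resets the streak.
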
#### New API Endpoints
- `GET /api/nl43/{unit_id}/polling/config` - Get device polling configuration
- `PUT /api/nl43/{unit_id}/polling/config` - Update polling interval and enable/disable per-device polling
- `GET /api/nl43/_polling/status` - Get global polling status for all devices with reachability info

#### Database Schema Changes
- **NL43Config table**:
  - `poll_interval_seconds` (Integer, default 60) - Polling interval in seconds
  - `poll_enabled` (Boolean, default true) - Enable/disable background polling per device

- **NL43Status table**:
  - `is_reachable` (Boolean, default true) - Current device reachability status
  - `consecutive_failures` (Integer, default 0) - Count of consecutive poll failures
  - `last_poll_attempt` (DateTime) - Last time background poller attempted to poll
  - `last_success` (DateTime) - Last successful poll timestamp
  - `last_error` (Text) - Last error message (truncated to 500 chars)

#### New Files
- `app/background_poller.py` - Background polling service implementation
- `migrate_add_polling_fields.py` - Database migration script for v0.2.0 schema changes
- `test_polling.sh` - Comprehensive test script for polling functionality
- `CHANGELOG.md` - This changelog file

### Changed
- **Enhanced status endpoint** - `GET /api/nl43/{unit_id}/status` now includes polling-related fields (is_reachable, consecutive_failures, last_poll_attempt, last_success, last_error)
- **Application startup** - Added lifespan context manager in `app/main.py` to manage background poller lifecycle
- **Performance improvement** - Terra-View requests now return cached data instantly (<100ms) instead of waiting for device queries (1-2 seconds)

### Technical Details

#### Architecture
- Background poller runs as async task using `asyncio.create_task()`
- Uses existing `NL43Client` and `persist_snapshot()` functions - no code duplication
- Respects existing 1-second rate limiting per device
- Efficient resource usage - skips work when no devices configured
- WebSocket streaming remains unaffected - separate real-time data path

#### Default Behavior
- Existing devices automatically get 60-second polling interval
- Existing status records default to `is_reachable=true`
- Migration is additive-only - no data loss
- Polling can be disabled per-device via `poll_enabled=false`

#### Recommended Intervals
- Critical monitoring: 30 seconds
- Normal monitoring: 60 seconds (default)
- Battery conservation: 300 seconds (5 minutes)
- Development/testing: 10 seconds (minimum allowed)

### Migration Notes

To upgrade from v0.1.x to v0.2.0:

1. **Stop the service** (if running):
   ```bash
   docker compose down slmm
   # OR
   # Stop your uvicorn process
   ```

2. **Update code**:
   ```bash
   git pull
   # OR copy new files
   ```

3. **Run migration**:
   ```bash
   cd slmm
   python3 migrate_add_polling_fields.py
   ```

4. **Restart service**:
   ```bash
   docker compose up -d --build slmm
   # OR
   uvicorn app.main:app --host 0.0.0.0 --port 8100
   ```

5. **Verify polling is active**:
   ```bash
   curl http://localhost:8100/api/nl43/_polling/status | jq '.'
   ```

You should see `"poller_running": true` and all configured devices listed.

### Breaking Changes
None. This release is fully backward-compatible with v0.1.x. All existing endpoints and functionality remain unchanged.

---

## [0.1.0] - 2025-12-XX

### Added
- Initial release
- REST API for NL43/NL53 sound level meter control
- TCP command protocol implementation
- FTP file download support
- WebSocket streaming for real-time data (DRD)
- Device configuration management
- Measurement control (start, stop, pause, resume, reset, store)
- Device information endpoints (battery, clock, results)
- Measurement settings management (frequency/time weighting)
- Sleep mode control
- Rate limiting (1-second minimum between commands)
- SQLite database for device configs and status cache
- Health check endpoints
- Comprehensive API documentation
- NL43 protocol documentation

### Database Schema (v0.1.0)
- **NL43Config table** - Device connection configuration
- **NL43Status table** - Measurement snapshot cache

---

## Version History Summary

- **v0.3.0** (2026-02-17) - Persistent TCP connections with keepalive for cellular modem reliability
- **v0.2.1** (2026-01-23) - Roster management, scheduler hooks, FTP logging, doc cleanup
- **v0.2.0** (2026-01-15) - Background Polling System
- **v0.1.0** (2025-12-XX) - Initial Release
@@ -1,15 +1,23 @@
|
|||||||
# SLMM - Sound Level Meter Manager
|
# SLMM - Sound Level Meter Manager
|
||||||
|
|
||||||
|
**Version 0.4.0**
|
||||||
|
|
||||||
Backend API service for controlling and monitoring Rion NL-43/NL-53 Sound Level Meters via TCP and FTP protocols.
|
Backend API service for controlling and monitoring Rion NL-43/NL-53 Sound Level Meters via TCP and FTP protocols.
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
SLMM is a standalone backend module that provides REST API routing and command translation for NL43/NL53 sound level meters. This service acts as a bridge between the hardware devices and frontend applications, handling all device communication, data persistence, and protocol management.
|
SLMM is a standalone backend module that provides REST API routing and command translation for NL43/NL53 sound level meters. This service acts as a bridge between the hardware devices and frontend applications, handling all device communication, data persistence, and protocol management.
|
||||||
|
|
||||||
**Note:** This is a backend-only service. Actual user interfacing is done via [SFM/Terra-View](https://github.com/your-org/terra-view) frontend applications.
|
**Note:** This is a backend-only service. Actual user interfacing is done via customized front ends or a CLI.
|
||||||
|
|
||||||
## Features
|
## Features
|
||||||
|
|
||||||
|
- **Live Monitor (fan-out)**: One shared cached live feed per device — many clients subscribe to the same stream instead of fighting over the meter's single TCP connection
|
||||||
|
- **Alert Engine**: Per-device threshold rules with onset/clear events, cooldowns, acks, and 24/7 evaluation
|
||||||
|
- **History & Percentiles**: Downsampled DOD trail + history endpoint for live-chart backfill; LN1/LN2 (L1/L10) percentiles surfaced through the feed
|
||||||
|
- **Persistent TCP Connections**: Cached per-device connections with OS-level keepalive, tuned for cellular modem reliability
|
||||||
|
- **Background Polling**: Continuous automatic polling of devices with configurable intervals
|
||||||
|
- **Offline Detection**: Automatic device reachability tracking with failure counters
|
||||||
- **Device Management**: Configure and manage multiple NL43/NL53 devices
|
- **Device Management**: Configure and manage multiple NL43/NL53 devices
|
||||||
- **Real-time Monitoring**: Stream live measurement data via WebSocket
|
- **Real-time Monitoring**: Stream live measurement data via WebSocket
|
||||||
- **Measurement Control**: Start, stop, pause, resume, and reset measurements
|
- **Measurement Control**: Start, stop, pause, resume, and reset measurements
|
||||||
@@ -18,22 +26,72 @@ SLMM is a standalone backend module that provides REST API routing and command t
|
|||||||
- **Device Configuration**: Manage frequency/time weighting, clock sync, and more
|
- **Device Configuration**: Manage frequency/time weighting, clock sync, and more
|
||||||
- **Rate Limiting**: Automatic 1-second delay enforcement between device commands
|
- **Rate Limiting**: Automatic 1-second delay enforcement between device commands
|
||||||
- **Persistent Storage**: SQLite database for device configs and measurement cache
|
- **Persistent Storage**: SQLite database for device configs and measurement cache
|
||||||
|
- **Connection Diagnostics**: Live UI and API endpoints for monitoring TCP connection pool status
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
```
|
```
|
||||||
┌─────────────────┐ ┌──────────────┐ ┌─────────────────┐
|
┌─────────────────┐ ┌──────────────────────────────┐ ┌─────────────────┐
|
||||||
│ Terra-View UI │◄───────►│ SLMM API │◄───────►│ NL43/NL53 │
|
│ │◄───────►│ SLMM API │◄───────►│ NL43/NL53 │
|
||||||
│ (Frontend) │ HTTP │ (Backend) │ TCP │ Sound Meters │
|
│ (Frontend) │ HTTP │ • REST Endpoints │ TCP │ Sound Meters │
|
||||||
└─────────────────┘ └──────────────┘ └─────────────────┘
|
└─────────────────┘ │ • WebSocket Streaming │ (kept │ (via cellular │
|
||||||
│
|
│ • Background Poller │ alive) │ modem) │
|
||||||
▼
|
│ • Connection Pool (v0.3) │ └─────────────────┘
|
||||||
┌──────────────┐
|
└──────────────────────────────┘
|
||||||
│ SQLite DB │
|
│
|
||||||
│ (Cache) │
|
▼
|
||||||
└──────────────┘
|
┌──────────────┐
|
||||||
|
│ SQLite DB │
|
||||||
|
│ • Config │
|
||||||
|
│ • Status │
|
||||||
|
└──────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Live Monitor — Fan-Out Feed (v0.4.0)
|
||||||
|
|
||||||
|
The NL-43 allows only one TCP control connection at a time, so multiple clients
|
||||||
|
polling the same device directly would contend for it. The monitor solves this
|
||||||
|
with a single shared, cached feed per device:
|
||||||
|
|
||||||
|
- **One reader, many subscribers**: a single poller reads the device; every
|
||||||
|
WebSocket subscriber (`WS /api/nl43/{unit_id}/monitor`) receives the same
|
||||||
|
frames — an instant first frame from cache, then live updates.
|
||||||
|
- **Persistent + auto-start**: a `monitor_enabled` flag keeps the feed running
|
||||||
|
and auto-starts it on boot. Enabled alert rules pin the monitor on for 24/7
|
||||||
|
evaluation even with no UI connected.
|
||||||
|
- **Adaptive & deduplicated**: poll rate adapts to demand, unreachable devices
|
||||||
|
back off, and the background poller skips units already covered by a monitor.
|
||||||
|
|
||||||
|
### Alert Engine (v0.4.0)
|
||||||
|
|
||||||
|
Per-device threshold alerting evaluated against the live feed:
|
||||||
|
|
||||||
|
- **Rules**: metric + threshold + `cooldown_s`, full CRUD per device
|
||||||
|
- **Events**: onset/clear state machine, acknowledgement, and a device-offline
|
||||||
|
alert when a monitored unit drops
|
||||||
|
- **Robust**: editing/deleting a rule resets its state and closes open events
|
||||||
|
|
||||||
|
### Persistent TCP Connection Pool (v0.3.0)
|
||||||
|
|
||||||
|
SLMM maintains persistent TCP connections to devices with OS-level keepalive, designed for reliable operation over cellular modems:
|
||||||
|
|
||||||
|
- **Connection Reuse**: One cached TCP socket per device, reused across all commands (no repeated handshakes)
|
||||||
|
- **TCP Keepalive**: Probes keep cellular NAT tables alive and detect dead connections early
|
||||||
|
- **Transparent Retry**: Stale cached connections automatically retry with a fresh socket
|
||||||
|
- **Configurable**: Idle TTL (300s), max age (1800s), and keepalive timing via environment variables
|
||||||
|
- **Diagnostics**: Live UI on the roster page and API endpoints for monitoring pool status
|
||||||
|
|
||||||
|
### Background Polling (v0.2.0)
|
||||||
|
|
||||||
|
Background polling service continuously queries devices and updates the status cache:
|
||||||
|
|
||||||
|
- **Automatic Updates**: Devices are polled at configurable intervals (10-3600 seconds)
|
||||||
|
- **Offline Detection**: Devices marked unreachable after 3 consecutive failures
|
||||||
|
- **Per-Device Configuration**: Each device can have a custom polling interval
|
||||||
|
- **Resource Efficient**: Dynamic sleep intervals and smart scheduling
|
||||||
|
|
||||||
|
Status requests return cached data instantly (<100ms) instead of waiting for device queries (1-2 seconds).
|
||||||
|
|
||||||
## Quick Start
|
## Quick Start
|
||||||
|
|
||||||
### Prerequisites
|
### Prerequisites
|
||||||
@@ -77,9 +135,18 @@ Once running, visit:
|
|||||||
|
|
||||||
### Environment Variables
|
### Environment Variables
|
||||||
|
|
||||||
|
**Server:**
|
||||||
- `PORT`: Server port (default: 8100)
|
- `PORT`: Server port (default: 8100)
|
||||||
- `CORS_ORIGINS`: Comma-separated list of allowed origins (default: "*")
|
- `CORS_ORIGINS`: Comma-separated list of allowed origins (default: "*")
|
||||||
|
|
||||||
|
**TCP Connection Pool:**
|
||||||
|
- `TCP_PERSISTENT_ENABLED`: Enable persistent connections (default: "true")
|
||||||
|
- `TCP_IDLE_TTL`: Close idle connections after N seconds (default: 300)
|
||||||
|
- `TCP_MAX_AGE`: Force reconnect after N seconds (default: 1800)
|
||||||
|
- `TCP_KEEPALIVE_IDLE`: Seconds idle before keepalive probes (default: 15)
|
||||||
|
- `TCP_KEEPALIVE_INTERVAL`: Seconds between keepalive probes (default: 10)
|
||||||
|
- `TCP_KEEPALIVE_COUNT`: Failed probes before declaring dead (default: 3)
|
||||||
|
|
||||||
### Database
|
### Database
|
||||||
|
|
||||||
The SQLite database is automatically created at [data/slmm.db](data/slmm.db) on first run.
|
The SQLite database is automatically created at [data/slmm.db](data/slmm.db) on first run.
|
||||||
@@ -103,10 +170,49 @@ Logs are written to:
|
|||||||
|
|
||||||
| Method | Endpoint | Description |
|
| Method | Endpoint | Description |
|
||||||
|--------|----------|-------------|
|
|--------|----------|-------------|
|
||||||
| GET | `/api/nl43/{unit_id}/status` | Get cached measurement snapshot |
|
| GET | `/api/nl43/{unit_id}/status` | Get cached measurement snapshot (updated by background poller) |
|
||||||
| GET | `/api/nl43/{unit_id}/live` | Request fresh DOD data from device |
|
| GET | `/api/nl43/{unit_id}/live` | Request fresh DOD data from device (bypasses cache) |
|
||||||
|
| GET | `/api/nl43/{unit_id}/history` | Downsampled DOD trail for live-chart backfill |
|
||||||
| WS | `/api/nl43/{unit_id}/stream` | WebSocket stream for real-time DRD data |
|
| WS | `/api/nl43/{unit_id}/stream` | WebSocket stream for real-time DRD data |
|
||||||
|
|
||||||
|
### Live Monitor (fan-out feed)
|
||||||
|
|
||||||
|
| Method | Endpoint | Description |
|
||||||
|
|--------|----------|-------------|
|
||||||
|
| WS | `/api/nl43/{unit_id}/monitor` | Subscribe to the shared cached live feed (instant first frame) |
|
||||||
|
| POST | `/api/nl43/{unit_id}/monitor/start` | Start the device's monitor feed |
|
||||||
|
| POST | `/api/nl43/{unit_id}/monitor/stop` | Stop the device's monitor feed |
|
||||||
|
| GET | `/api/nl43/_monitor/status` | Global monitor status across devices |
|
||||||
|
| POST | `/api/nl43/{unit_id}/disconnect` | Drop the device's pooled TCP connection |
|
||||||
|
| POST | `/api/nl43/{unit_id}/deactivate` | Quiesce polling/monitoring for one device |
|
||||||
|
| POST | `/api/nl43/_system/standby` | Global standby — quiesce all polling/monitoring |
|
||||||
|
|
||||||
|
### Alerts
|
||||||
|
|
||||||
|
| Method | Endpoint | Description |
|
||||||
|
|--------|----------|-------------|
|
||||||
|
| GET | `/api/nl43/{unit_id}/alerts/rules` | List alert rules for a device |
|
||||||
|
| POST | `/api/nl43/{unit_id}/alerts/rules` | Create an alert rule (metric, threshold, cooldown) |
|
||||||
|
| PUT | `/api/nl43/{unit_id}/alerts/rules/{rule_id}` | Update a rule (resets its state, closes open events) |
|
||||||
|
| DELETE | `/api/nl43/{unit_id}/alerts/rules/{rule_id}` | Delete a rule |
|
||||||
|
| GET | `/api/nl43/{unit_id}/alerts/events` | List alert events (onset/clear) |
|
||||||
|
| POST | `/api/nl43/{unit_id}/alerts/events/{event_id}/ack` | Acknowledge an event |
|
||||||
|
|
||||||
|
### Background Polling
|
||||||
|
|
||||||
|
| Method | Endpoint | Description |
|
||||||
|
|--------|----------|-------------|
|
||||||
|
| GET | `/api/nl43/{unit_id}/polling/config` | Get device polling configuration |
|
||||||
|
| PUT | `/api/nl43/{unit_id}/polling/config` | Update polling interval and enable/disable polling |
|
||||||
|
| GET | `/api/nl43/_polling/status` | Get global polling status for all devices |
|
||||||
|
|
||||||
|
### Connection Pool
|
||||||
|
|
||||||
|
| Method | Endpoint | Description |
|
||||||
|
|--------|----------|-------------|
|
||||||
|
| GET | `/api/nl43/_connections/status` | Get pool config, active connections, age/idle times |
|
||||||
|
| POST | `/api/nl43/_connections/flush` | Force-close all cached TCP connections |
|
||||||
|
|
||||||
### Measurement Control
|
### Measurement Control
|
||||||
|
|
||||||
| Method | Endpoint | Description |
|
| Method | Endpoint | Description |
|
||||||
@@ -167,6 +273,7 @@ slmm/
|
|||||||
│ ├── routers.py # API route definitions
|
│ ├── routers.py # API route definitions
|
||||||
│ ├── models.py # SQLAlchemy database models
|
│ ├── models.py # SQLAlchemy database models
|
||||||
│ ├── services.py # NL43Client and business logic
|
│ ├── services.py # NL43Client and business logic
|
||||||
|
│ ├── background_poller.py # Background polling service ⭐ NEW
|
||||||
│ └── database.py # Database configuration
|
│ └── database.py # Database configuration
|
||||||
├── data/
|
├── data/
|
||||||
│ ├── slmm.db # SQLite database (auto-created)
|
│ ├── slmm.db # SQLite database (auto-created)
|
||||||
@@ -175,9 +282,12 @@ slmm/
|
|||||||
├── templates/
|
├── templates/
|
||||||
│ └── index.html # Simple web interface (optional)
|
│ └── index.html # Simple web interface (optional)
|
||||||
├── manuals/ # Device documentation
|
├── manuals/ # Device documentation
|
||||||
|
├── migrate_add_polling_fields.py # Database migration for v0.2.0 ⭐ NEW
|
||||||
|
├── test_polling.sh # Polling feature test script ⭐ NEW
|
||||||
├── API.md # Detailed API documentation
|
├── API.md # Detailed API documentation
|
||||||
├── COMMUNICATION_GUIDE.md # NL43 protocol documentation
|
├── COMMUNICATION_GUIDE.md # NL43 protocol documentation
|
||||||
├── NL43_COMMANDS.md # Command reference
|
├── NL43_COMMANDS.md # Command reference
|
||||||
|
├── CHANGELOG.md # Version history ⭐ NEW
|
||||||
├── requirements.txt # Python dependencies
|
├── requirements.txt # Python dependencies
|
||||||
└── README.md # This file
|
└── README.md # This file
|
||||||
```
|
```
|
||||||
@@ -194,12 +304,16 @@ Stores device connection configuration:
|
|||||||
- `ftp_username`: FTP authentication username
|
- `ftp_username`: FTP authentication username
|
||||||
- `ftp_password`: FTP authentication password
|
- `ftp_password`: FTP authentication password
|
||||||
- `web_enabled`: Enable/disable web interface access
|
- `web_enabled`: Enable/disable web interface access
|
||||||
|
- `poll_interval_seconds`: Polling interval in seconds (10-3600, default: 60) ⭐ NEW
|
||||||
|
- `poll_enabled`: Enable/disable background polling for this device ⭐ NEW
|
||||||
|
|
||||||
### NL43Status Table
|
### NL43Status Table
|
||||||
Caches latest measurement snapshot:
|
Caches latest measurement snapshot:
|
||||||
- `unit_id` (PK): Unique device identifier
|
- `unit_id` (PK): Unique device identifier
|
||||||
- `last_seen`: Timestamp of last update
|
- `last_seen`: Timestamp of last update
|
||||||
- `measurement_state`: Current state (Measure/Stop)
|
- `measurement_state`: Current state (Measure/Stop)
|
||||||
|
- `measurement_start_time`: When measurement started (UTC)
|
||||||
|
- `counter`: Measurement interval counter (1-600)
|
||||||
- `lp`: Instantaneous sound pressure level
|
- `lp`: Instantaneous sound pressure level
|
||||||
- `leq`: Equivalent continuous sound level
|
- `leq`: Equivalent continuous sound level
|
||||||
- `lmax`: Maximum sound level
|
- `lmax`: Maximum sound level
|
||||||
@@ -210,11 +324,43 @@ Caches latest measurement snapshot:
|
|||||||
- `sd_remaining_mb`: Free SD card space (MB)
|
- `sd_remaining_mb`: Free SD card space (MB)
|
||||||
- `sd_free_ratio`: SD card free space ratio
|
- `sd_free_ratio`: SD card free space ratio
|
||||||
- `raw_payload`: Raw device response data
|
- `raw_payload`: Raw device response data
|
||||||
|
- `is_reachable`: Device reachability status (Boolean)
|
||||||
|
- `consecutive_failures`: Count of consecutive poll failures
|
||||||
|
- `last_poll_attempt`: Last time background poller attempted to poll
|
||||||
|
- `last_success`: Last successful poll timestamp
|
||||||
|
- `last_error`: Last error message (truncated to 500 chars)
|
||||||
|
- `ln1` / `ln2`: LN1/LN2 (L1/L10) percentile levels ⭐ v0.4.0
|
||||||
|
|
||||||
|
### NL43Readings Table ⭐ v0.4.0
|
||||||
|
Downsampled DOD trail backing the live-chart history endpoint (one row/minute,
|
||||||
|
pruned to a retention window — viewing only, not the report source):
|
||||||
|
- `id` (PK), `unit_id`, `timestamp`
|
||||||
|
- `lp` / `leq` / `lmax` / `ln1` / `ln2`: cached level samples
|
||||||
|
|
||||||
|
### AlertRule Table ⭐ v0.4.0
|
||||||
|
Per-device threshold alert rules:
|
||||||
|
- `id` (PK), `unit_id`, `name`, `enabled`
|
||||||
|
- `metric`, `comparison` (above/below), `threshold_db`, `clear_margin_db` (hysteresis)
|
||||||
|
- `duration_s` (sustained), `cooldown_s` (min seconds between onsets)
|
||||||
|
- `channels` / `recipients`, optional `schedule_start`/`schedule_end`/`schedule_days`
|
||||||
|
|
||||||
|
### AlertEvent Table ⭐ v0.4.0
|
||||||
|
Alert onset/clear events for history, inbox, and acknowledgement:
|
||||||
|
- `id` (PK), `unit_id`, `rule_id`, `rule_name`, `metric`, `threshold_db`
|
||||||
|
- `onset_at` / `onset_value`, `peak_value`, `clear_at`, `status` (active/cleared)
|
||||||
|
- `acknowledged_at` / `acknowledged_by`, `notes`
|
||||||
|
|
||||||
|
> New tables (`alert_rules`, `alert_events`, `nl43_readings`) auto-create on
|
||||||
|
> startup. Existing-table columns ship with migrations:
|
||||||
|
> `migrate_add_ln_percentiles.py`, `migrate_add_monitor_enabled.py`.
|
||||||
|
|
||||||
## Protocol Details
|
## Protocol Details
|
||||||
|
|
||||||
### TCP Communication
|
### TCP Communication
|
||||||
- Uses ASCII command protocol over TCP
|
- Uses ASCII command protocol over TCP
|
||||||
|
- Persistent connections with OS-level keepalive (tuned for cellular modems)
|
||||||
|
- Connections cached per device and reused across commands
|
||||||
|
- Transparent retry on stale connections
|
||||||
- Enforces ≥1 second delay between commands to same device
|
- Enforces ≥1 second delay between commands to same device
|
||||||
- Two-line response format:
|
- Two-line response format:
|
||||||
- Line 1: Result code (R+0000 for success)
|
- Line 1: Result code (R+0000 for success)
|
||||||
@@ -253,11 +399,43 @@ curl -X PUT http://localhost:8100/api/nl43/meter-001/config \
|
|||||||
curl -X POST http://localhost:8100/api/nl43/meter-001/start
|
curl -X POST http://localhost:8100/api/nl43/meter-001/start
|
||||||
```
|
```
|
||||||
|
|
||||||
### Get Live Status
|
### Get Cached Status (Fast - from background poller)
|
||||||
|
```bash
|
||||||
|
curl http://localhost:8100/api/nl43/meter-001/status
|
||||||
|
```
|
||||||
|
|
||||||
|
### Get Live Status (Bypasses cache)
|
||||||
```bash
|
```bash
|
||||||
curl http://localhost:8100/api/nl43/meter-001/live
|
curl http://localhost:8100/api/nl43/meter-001/live
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Configure Background Polling ⭐ NEW
|
||||||
|
```bash
|
||||||
|
# Set polling interval to 30 seconds
|
||||||
|
curl -X PUT http://localhost:8100/api/nl43/meter-001/polling/config \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"poll_interval_seconds": 30,
|
||||||
|
"poll_enabled": true
|
||||||
|
}'
|
||||||
|
|
||||||
|
# Get polling configuration
|
||||||
|
curl http://localhost:8100/api/nl43/meter-001/polling/config
|
||||||
|
|
||||||
|
# Check global polling status
|
||||||
|
curl http://localhost:8100/api/nl43/_polling/status
|
||||||
|
```
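
For scripted setups, the polling configuration call above can also be prepared from Python. This is a minimal sketch: the helper name `build_polling_config_request` is illustrative and not part of SLMM. Only the endpoint path, the two field names, and the 10-3600 second range come from this README. It builds the request without sending it, so it can be checked without a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8100"  # same host/port as the curl examples


def build_polling_config_request(
    unit_id: str, interval_s: int, enabled: bool = True
) -> urllib.request.Request:
    """Build (but do not send) a PUT request for a unit's polling config.

    Hypothetical helper for illustration only.
    """
    # Mirror the documented range for poll_interval_seconds.
    if not 10 <= interval_s <= 3600:
        raise ValueError("poll_interval_seconds must be between 10 and 3600")
    body = json.dumps(
        {"poll_interval_seconds": interval_s, "poll_enabled": enabled}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/nl43/{unit_id}/polling/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_polling_config_request("meter-001", 30)
    print(req.get_method(), req.full_url)
    # Sending requires a running SLMM instance:
    # with urllib.request.urlopen(req, timeout=10) as resp:
    #     print(resp.status, resp.read().decode())
```

Validating the interval on the client side is a convenience; the server's own validation remains authoritative.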
|
||||||
|
|
||||||
|
### Check Connection Pool Status
|
||||||
|
```bash
|
||||||
|
curl http://localhost:8100/api/nl43/_connections/status | jq '.'
|
||||||
|
```
|
||||||
|
|
||||||
|
### Flush All Cached Connections
|
||||||
|
```bash
|
||||||
|
curl -X POST http://localhost:8100/api/nl43/_connections/flush
|
||||||
|
```
|
||||||
|
|
||||||
### Verify Device Settings
|
### Verify Device Settings
|
||||||
```bash
|
```bash
|
||||||
curl http://localhost:8100/api/nl43/meter-001/settings
|
curl http://localhost:8100/api/nl43/meter-001/settings
|
||||||
@@ -326,11 +504,19 @@ See [API.md](API.md) for detailed integration examples.
|
|||||||
## Troubleshooting
|
## Troubleshooting
|
||||||
|
|
||||||
### Connection Issues
|
### Connection Issues
|
||||||
|
- Check connection pool status: `curl http://localhost:8100/api/nl43/_connections/status`
|
||||||
|
- Flush stale connections: `curl -X POST http://localhost:8100/api/nl43/_connections/flush`
|
||||||
- Verify device IP address and port in configuration
|
- Verify device IP address and port in configuration
|
||||||
- Ensure device is on the same network
|
- Ensure device is on the same network
|
||||||
- Check firewall rules allow TCP/FTP connections
|
- Check firewall rules allow TCP/FTP connections
|
||||||
- Verify RX55 network adapter is properly configured on device
|
- Verify RX55 network adapter is properly configured on device
|
||||||
|
|
||||||
|
### Cellular Modem Issues
|
||||||
|
- If the modem wedges from too many handshakes, ensure `TCP_PERSISTENT_ENABLED=true` (default)
|
||||||
|
- Increase `TCP_IDLE_TTL` if connections expire between poll cycles
|
||||||
|
- Keepalive probes (default: every 15s) keep NAT tables alive — adjust `TCP_KEEPALIVE_IDLE` if needed
|
||||||
|
- Set `TCP_PERSISTENT_ENABLED=false` to disable pooling for debugging
|
||||||
|
|
||||||
### Rate Limiting
|
### Rate Limiting
|
||||||
- API automatically enforces 1-second delay between commands
|
- API automatically enforces 1-second delay between commands
|
||||||
- If experiencing delays, this is normal device behavior
|
- If experiencing delays, this is normal device behavior
|
||||||
@@ -356,13 +542,31 @@ pytest
|
|||||||
|
|
||||||
### Database Migrations
|
### Database Migrations
|
||||||
```bash
|
```bash
|
||||||
# Migrate existing database to add FTP credentials
|
# Migrate to v0.2.0 (add background polling fields)
|
||||||
|
python3 migrate_add_polling_fields.py
|
||||||
|
|
||||||
|
# Legacy: Migrate to add FTP credentials
|
||||||
python migrate_add_ftp_credentials.py
|
python migrate_add_ftp_credentials.py
|
||||||
|
|
||||||
# Set FTP credentials for a device
|
# Set FTP credentials for a device
|
||||||
python set_ftp_credentials.py <unit_id> <username> <password>
|
python set_ftp_credentials.py <unit_id> <username> <password>
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Testing Background Polling
|
||||||
|
```bash
|
||||||
|
# Run comprehensive polling tests
|
||||||
|
./test_polling.sh [unit_id]
|
||||||
|
|
||||||
|
# Test settings endpoint
|
||||||
|
python3 test_settings_endpoint.py <unit_id>
|
||||||
|
|
||||||
|
# Test sleep mode auto-disable
|
||||||
|
python3 test_sleep_mode_auto_disable.py <unit_id>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Legacy Scripts
|
||||||
|
Old migration scripts and manual polling tools have been moved to `archive/` for reference. See [archive/README.md](archive/README.md) for details.
|
||||||
|
|
||||||
## Contributing
|
## Contributing
|
||||||
|
|
||||||
This is a standalone module kept separate from the SFM/Terra-View codebase. When contributing:
|
This is a standalone module kept separate from the SFM/Terra-View codebase. When contributing:
|
||||||
|
|||||||
@@ -0,0 +1,403 @@
|
|||||||
|
# NL-43 + RX55 TCP “Wedge” Investigation (2255 Refusal) — Full Log & Next Steps
|
||||||
|
**Last updated:** 2026-02-18
|
||||||
|
**Owner:** Brian / serversdown
|
||||||
|
**Context:** Terra-View / SLMM / field-deployed Rion NL-43 behind Sierra Wireless RX55
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 0) What this document is
|
||||||
|
This is a **comprehensive, chronological** record of the debugging we did to isolate a failure where the **NL-43’s TCP control port (2255) eventually stops accepting connections** (“wedges”), while other services (notably FTP/21) remain reachable.
|
||||||
|
|
||||||
|
This is written to be fed back into future troubleshooting, so it intentionally includes the **full reasoning chain, experiments, commands, packet evidence, and conclusions**.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1) Architecture (as tested)
|
||||||
|
### Network path
|
||||||
|
- **Server (SLMM host):** `10.0.0.40`
|
||||||
|
- **RX55 WAN IP:** `63.45.161.30`
|
||||||
|
- **RX55 LAN subnet:** `192.168.1.0/24`
|
||||||
|
- **RX55 LAN gateway:** `192.168.1.1`
|
||||||
|
- **NL-43 LAN IP:** `192.168.1.10` (confirmed via ARP OUI + ping; see LAN validation)
|
||||||
|
|
||||||
|
### RX55 details
|
||||||
|
- **Sierra Wireless RX55**
|
||||||
|
- **OS:** 5.2
|
||||||
|
- **Firmware:** `01.14.24.00`
|
||||||
|
- **Carrier:** Verizon LTE (Band 66)
|
||||||
|
|
||||||
|
### Port forwarding rules (RX55)
|
||||||
|
- **WAN:2255 → NL-43:2255** (NL-43 TCP control)
|
||||||
|
- **WAN:21 → NL-43:21** (NL-43 FTP control)
|
||||||
|
|
||||||
|
You also experimented with additional forwards:
|
||||||
|
- **WAN:2253 → NL-43:2255** (test)
|
||||||
|
- **WAN:2253 → NL-43:2253** (test)
|
||||||
|
- **WAN:4450 → NL-43:4450** (test)
|
||||||
|
|
||||||
|
**Important:** Rule “Input zone / interface” was set to **WAN-NAT**, and Source IP left as **Any IPv4**. This is correct for inbound port-forward behavior on Sierra OS 5.x.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2) Original problem statement (the “wedge”)
|
||||||
|
After running for hours, the NL-43 becomes unreachable over TCP control.
|
||||||
|
|
||||||
|
### Symptom signature (WAN-side)
|
||||||
|
- Client attempts to connect to `63.45.161.30:2255`
|
||||||
|
- Instead of timing out, the client gets **connection refused** quickly.
|
||||||
|
- Packet-level: SYN from client → **RST,ACK** back (meaning active refusal vs silent drop)
|
||||||
|
|
||||||
|
### Critical operational behavior
|
||||||
|
- **Power cycling the NL-43 fixes it.**
|
||||||
|
- **Power cycling the RX55 does NOT fix it.**
|
||||||
|
- FTP sometimes remains available even while TCP control (2255) is dead.
|
||||||
|
|
||||||
|
This combination is what forced us to determine whether:
|
||||||
|
- The RX55 is rejecting connections, OR
|
||||||
|
- The NL-43 is no longer listening on 2255, OR
|
||||||
|
- Something about the RX55 path triggers the NL-43’s control listener to die.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3) Event timeline evidence (SLMM logs)
|
||||||
|
A concrete wedge window was observed on **2026-02-18**:
|
||||||
|
|
||||||
|
- 10:55:46 AM — Poll success (Start)
|
||||||
|
- 11:00:28 AM — Measurement STOPPED (scheduled stop/download cycle succeeded)
|
||||||
|
- 11:55:50 AM — Poll success (Stop)
|
||||||
|
- 12:55:55 PM — Poll success (Stop)
|
||||||
|
- **1:55:58 PM — Poll failed (attempt 1/3): Errno 111 (connection refused)**
|
||||||
|
- 2:56:02 PM — Poll failed (attempt 2/3): Errno 111 (connection refused)
|
||||||
|
|
||||||
|
Key interpretation:
|
||||||
|
- The wedge occurred sometime between **12:55 and 1:55**.
|
||||||
|
- The failure type is **refused**, not timeout.
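
The wedge window above can be derived mechanically from a poll log. The sketch below is illustrative only: `wedge_window` is a hypothetical helper, and the timestamps are the poll entries listed in this section converted to 24-hour form.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def wedge_window(
    polls: List[Tuple[datetime, bool]]
) -> Optional[Tuple[datetime, datetime]]:
    """Return (last successful poll, first failed poll after it), or None."""
    last_ok = None
    for ts, ok in sorted(polls):
        if ok:
            last_ok = ts
        elif last_ok is not None:
            # First failure following a success bounds the wedge window.
            return last_ok, ts
    return None


# Poll entries from the 2026-02-18 log (True = success, False = refused).
POLLS = [
    (datetime(2026, 2, 18, 10, 55, 46), True),
    (datetime(2026, 2, 18, 11, 55, 50), True),
    (datetime(2026, 2, 18, 12, 55, 55), True),
    (datetime(2026, 2, 18, 13, 55, 58), False),
    (datetime(2026, 2, 18, 14, 56, 2), False),
]

if __name__ == "__main__":
    start, end = wedge_window(POLLS)
    print(f"wedge between {start:%H:%M:%S} and {end:%H:%M:%S}")
```

With hourly polling the window can never be narrower than roughly one hour, which is one reason a shorter poll interval helps when trying to correlate a wedge with a specific event.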
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4) Early hypotheses (before proof)
|
||||||
|
We considered two main buckets:
|
||||||
|
|
||||||
|
### A) NL-43-side failure (most suspicious)
|
||||||
|
- NL-43 TCP control service crashes / exits / unbinds from 2255
|
||||||
|
- socket leak / accept backlog exhaustion
|
||||||
|
- “single control session allowed” and it gets stuck thinking a session is active
|
||||||
|
- mode/service manager bug (service restart fails after other activities)
|
||||||
|
- firmware bug in TCP daemon
|
||||||
|
|
||||||
|
### B) RX55-side failure (possible trigger / less likely once FTP works)
|
||||||
|
- NAT/forwarding table corruption
|
||||||
|
- firewall behavior
|
||||||
|
- helper/ALG interference
|
||||||
|
- MSS/MTU weirdness causing edge-case behavior
|
||||||
|
- session churn behavior causing downstream issues
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5) Key experiments and what they proved
|
||||||
|
|
||||||
|
### 5.1) LAN-only stability test (No RX55 path)
|
||||||
|
**Test:** NL-43 tested directly on LAN (no modem path involved).
|
||||||
|
- Ran **24+ hours**
|
||||||
|
- Scheduler start/stop cycles worked
|
||||||
|
- Stress test: **500 commands @ 1/sec** → no failure
|
||||||
|
- Response times trended downward over the run (no sign of degradation)
|
||||||
|
|
||||||
|
**Result:** The NL-43 appears stable in a “pure LAN” environment.
|
||||||
|
|
||||||
|
**Interpretation:** The trigger is likely related to the RX55/WAN environment, connection patterns, or service switching patterns—not just simple uptime.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 5.2) Port-forward behavior: timeout vs refused (RX55 behavior characterization)
|
||||||
|
You observed:
|
||||||
|
|
||||||
|
- **If a WAN port is NOT forwarded (no rule):** connecting to that port **times out** (silent drop)
|
||||||
|
- **If a WAN port IS forwarded to NL-43 but nothing listens:** it **actively refuses** (RST)
|
||||||
|
|
||||||
|
Concrete example:
|
||||||
|
- Port **4450** with no rule → timeout
|
||||||
|
- Port **4450 → NL-43:4450** rule created → connection refused
|
||||||
|
|
||||||
|
**Interpretation:** This confirms the RX55 is actually forwarding packets to the NL-43 when a rule exists. “Refused” is consistent with the NL-43 (or RX55 relay behavior) responding quickly because the packet reached the target.
|
||||||
|
|
||||||
|
Important nuance:
|
||||||
|
- A “refused” on forwarded ports does **not** automatically prove the NL-43 is the one generating RST, because NAT hides the inside host and the RX55 could reject on behalf of an unreachable target. We needed a LAN-side proof test to close the loop.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 5.3) UDP test confusion (and resolution)
|
||||||
|
You ran:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
nc -vzu 63.45.161.30 2255
|
||||||
|
nc -vz 63.45.161.30 2255
|
||||||
|
```
|
||||||
|
|
||||||
|
Observed:
|
||||||
|
- UDP: “succeeded”
|
||||||
|
- TCP: “connection refused”
|
||||||
|
|
||||||
|
Resolution:
|
||||||
|
- UDP has **no handshake**. netcat prints “succeeded” if it doesn’t immediately receive an ICMP unreachable. It does **not** mean a UDP service exists.
|
||||||
|
- TCP refused is meaningful: a RST implies “no listener” or “actively rejected.”
|
||||||
|
|
||||||
|
**Net effect:** UDP test did not change the diagnosis.
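
The timeout-versus-refused distinction used in 5.2 and 5.3 can be checked from a script as well as from `nc`. The following is a minimal sketch using only the Python standard library; the function name `classify_tcp_port` and the returned labels are illustrative, not part of SLMM.

```python
import socket


def classify_tcp_port(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one TCP connect and return 'open', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"      # three-way handshake completed
    except ConnectionRefusedError:
        return "refused"       # RST came back (ECONNREFUSED, Errno 111 on Linux)
    except (socket.timeout, TimeoutError):
        return "timeout"       # no reply within the timeout (silent drop)
    except OSError:
        return "error"         # e.g. network unreachable, DNS failure


if __name__ == "__main__":
    # Addresses and ports are the ones used in this investigation.
    for port in (2255, 21):
        print(port, classify_tcp_port("63.45.161.30", port))
```

During a wedge, the signature described in this document would show up as `refused` on 2255 and `open` on 21.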
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 5.4) Packet capture proof (WAN-side)
|
||||||
|
You captured a Wireshark/tcpdump summary with these key patterns:
|
||||||
|
|
||||||
|
#### Port 2255 (TCP control)
|
||||||
|
Example:
|
||||||
|
- `10.0.0.40 → 63.45.161.30:2255` SYN
|
||||||
|
- `63.45.161.30 → 10.0.0.40` **RST, ACK** within ~50ms
|
||||||
|
|
||||||
|
This happened repeatedly.
|
||||||
|
|
||||||
|
#### Port 2253 (test port)
|
||||||
|
Multiple SYN attempts to 2253 showed **retransmissions and no response**, i.e., **silent drop** (consistent with no rule or not forwarded at that moment).
|
||||||
|
|
||||||
|
#### Port 21 (FTP)
|
||||||
|
Clean 3-way handshake:
|
||||||
|
- SYN → SYN/ACK → ACK
|
||||||
|
Then:
|
||||||
|
- FTP server banner: `220 Connection Ready`
|
||||||
|
Then:
|
||||||
|
- `530 Not logged in` (because SLMM was sending non-FTP “requests” as an experiment)
|
||||||
|
Session closes cleanly.
|
||||||
|
|
||||||
|
**Key takeaway from capture:**
|
||||||
|
- TCP transport to NL-43 via RX55 is definitely working (port 21 proves it).
|
||||||
|
- Port 2255 is being actively refused.
|
||||||
|
|
||||||
|
This strongly suggested “2255 listener is gone,” but still didn’t fully prove whether the refusal was generated internally by NL-43 or by RX55 on behalf of NL-43.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6) The decisive experiment: LAN-side test while wedged (final proof)
|
||||||
|
Because the RX55 does not offer SSH, the plan was to test from **inside the LAN behind the RX55**.
|
||||||
|
|
||||||
|
### 6.1) Physical LAN tap setup
|
||||||
|
Constraint:
|
||||||
|
- NL-43 has only one Ethernet port.
|
||||||
|
|
||||||
|
Solution:
|
||||||
|
- Insert an unmanaged switch:
|
||||||
|
- RX55 LAN → switch
|
||||||
|
- NL-43 → switch
|
||||||
|
- Windows 10 laptop → switch
|
||||||
|
|
||||||
|
This creates a shared L2 segment where the laptop can test NL-43 directly.
|
||||||
|
|
||||||
|
### 6.2) Windows LAN validation
|
||||||
|
On the Windows laptop:
|
||||||
|
|
||||||
|
- `ipconfig` showed:
|
||||||
|
- IP: `192.168.1.100`
|
||||||
|
- Gateway: `192.168.1.1` (RX55)
|
||||||
|
- Initial `arp -a` only showed RX55, not NL-43.
|
||||||
|
|
||||||
|
You then:
|
||||||
|
- pinged likely host addresses and discovered NL-43 responds on **192.168.1.10**
|
||||||
|
- `arp -a` then showed:
|
||||||
|
- `192.168.1.10 → 00-10-50-14-0a-d8`
|
||||||
|
- OUI `00-10-50` recognized as **Rion** (matches NL-43)
|
||||||
|
|
||||||
|
So LAN identities were confirmed:
|
||||||
|
- RX55: `192.168.1.1`
|
||||||
|
- NL-43: `192.168.1.10`
|
||||||
|
|
||||||
|
### 6.3) The LAN port tests (the smoking gun)
|
||||||
|
From Windows:
|
||||||
|
|
||||||
|
```powershell
|
||||||
|
Test-NetConnection -ComputerName 192.168.1.10 -Port 2255
|
||||||
|
Test-NetConnection -ComputerName 192.168.1.10 -Port 21
|
||||||
|
```
|
||||||
|
|
||||||
|
Results (while the unit was “wedged” from the WAN perspective):
|
||||||
|
- **2255:** `TcpTestSucceeded : False`
|
||||||
|
- **21:** `TcpTestSucceeded : True`
|
||||||
|
|
||||||
|
**Conclusion (PROVEN):**
|
||||||
|
- The NL-43 is reachable on the LAN
|
||||||
|
- FTP port 21 is alive
|
||||||
|
- **The NL-43 is NOT listening on TCP port 2255**
|
||||||
|
- Therefore the RX55 is not the root cause of the refusal. The WAN refusal is consistent with the NL-43 having no listener on 2255.
|
||||||
|
|
||||||
|
This is now settled.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7) What we learned (final conclusions)
|
||||||
|
### 7.1) RX55 innocence (for this failure mode)
|
||||||
|
The RX55 is not “randomly rejecting” or “breaking TCP” in the way originally feared.
|
||||||
|
|
||||||
|
It successfully forwards and supports TCP to the NL-43 on port 21, and the LAN-side test proves the 2255 failure exists *even without NAT/WAN involvement*.
|
||||||
|
|
||||||
|
### 7.2) NL-43 control listener failure
|
||||||
|
The NL-43’s TCP control service (port 2255) stops listening while:
|
||||||
|
- the device remains alive
|
||||||
|
- the LAN stack remains alive (ping)
|
||||||
|
- FTP remains alive (port 21)
|
||||||
|
|
||||||
|
This looks like one of:
|
||||||
|
- control daemon crash/exit
|
||||||
|
- service unbind
|
||||||
|
- stuck service state (e.g., “busy” / “session active forever”)
|
||||||
|
- resource leak (sockets/file descriptors) specific to the control service
|
||||||
|
- firmware service manager bug (start/stop of services fails after certain sequences)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8) Additional constraint discovered: “Web App mode” conflicts
|
||||||
|
You noted an important operational constraint:
|
||||||
|
|
||||||
|
> Turning on the web app disables other interfaces like TCP and FTP.
|
||||||
|
|
||||||
|
Meaning the NL-43 appears to have mutually exclusive service/mode behavior (or at least serious conflicts). That matters because:
|
||||||
|
- If any workflow toggles modes (explicitly or implicitly), it could destabilize the service lifecycle.
|
||||||
|
- It reduces the possibility of using “web UI toggle” as an easy remote recovery mechanism **if** it disables the services needed.
|
||||||
|
|
||||||
|
We have not yet run a controlled long test to determine whether:
|
||||||
|
- mode switching contributes directly to the 2255 listener dying, OR
|
||||||
|
- it happens even in a pure TCP-only mode with no switching.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 9) Immediate operational decision (field tomorrow)
|
||||||
|
Because the device is needed in the field immediately, you chose:
|
||||||
|
- **Old-school manual deployment**
|
||||||
|
- **Manual SD card downloads**
|
||||||
|
- Avoid reliance on 2255/TCP control and remote workflows for now.
|
||||||
|
|
||||||
|
**Important operational note:**
|
||||||
|
The 2255 listener dying does not necessarily stop the NL-43 from measuring; it primarily breaks remote control/polling. Manual SD workflow sidesteps the entire remote control dependency.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 10) What’s next (future work — when the unit is back)
|
||||||
|
Because long tests can’t be run before tomorrow, the plan is to resume in a few weeks with controlled experiments designed to isolate the trigger and develop an operational mitigation.
|
||||||
|
|
||||||
|
### 10.1) Controlled experiment matrix (recommended)
|
||||||
|
Run each test for 24–72 hours, or until a wedge occurs, and record:
|
||||||
|
- number of TCP connects
|
||||||
|
- whether connections are persistent
|
||||||
|
- whether FTP is used
|
||||||
|
- whether any mode toggling is performed
|
||||||
|
- time-to-wedge
|
||||||
|
|
||||||
|
#### Test A — TCP-only (ideal baseline)
|
||||||
|
- TCP control only (2255)
|
||||||
|
- **True persistent connection** (open once, keep forever)
|
||||||
|
- No FTP
|
||||||
|
- No web mode toggling
|
||||||
|
|
||||||
|
Outcome interpretation:
|
||||||
|
- If stable: connection churn and/or FTP/mode switching is the trigger.
|
||||||
|
- If it wedges anyway: the cause is a leak or bug in the 2255 daemon itself.
|
||||||
|
|
||||||
|
#### Test B — TCP with connection churn
|
||||||
|
- Same as A but intentionally reconnect on a schedule (current SLMM behavior)
|
||||||
|
- No FTP
|
||||||
|
|
||||||
|
Outcome:
|
||||||
|
- If this wedges but A doesn’t: churn is the trigger.
|
||||||
|
|
||||||
|
#### Test C — FTP activity + TCP
|
||||||
|
- Introduce scheduled FTP sessions (downloads) while using TCP control
|
||||||
|
- Observe whether wedge correlates with FTP use or with post-download periods.
|
||||||
|
|
||||||
|
Outcome:
|
||||||
|
- If wedge correlates with FTP, suspect internal service lifecycle conflict.
|
||||||
|
|
||||||
|
#### Test D — Web mode interaction (only if safe/possible)
|
||||||
|
- Evaluate what toggling web mode does to TCP/FTP services.
|
||||||
|
- Determine if any remote-safe “soft reset” exists.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 11) Mitigation options (ranked)
|
||||||
|
### Option 1 — Make SLMM truly persistent (highest probability of success)
|
||||||
|
If the NL-43 wedges due to session churn or leaked socket states, the best mitigation is:
|
||||||
|
- Open one TCP socket per device
|
||||||
|
- Keep it open indefinitely
|
||||||
|
- Use OS keepalive
|
||||||
|
- Do **not** rotate connections on timers
|
||||||
|
- Reconnect only when the socket actually dies
|
||||||
|
|
||||||
|
This reduces:
|
||||||
|
- connect/close cycles
|
||||||
|
- NAT edge-case exposure
|
||||||
|
- resource churn inside NL-43
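
As a rough illustration of the "use OS keepalive" point, the sketch below enables keepalive on a socket with the Python standard library. The helper name `enable_keepalive` and the timing values are assumptions chosen for illustration, not SLMM's actual settings. The per-socket tuning options are platform-dependent, so each one is applied only if the running platform exposes it.

```python
import socket


def enable_keepalive(
    sock: socket.socket, idle_s: int = 15, interval_s: int = 10, probes: int = 3
) -> None:
    """Turn on TCP keepalive and, where supported, tune its timing.

    Hypothetical helper; the default values are illustrative only.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the socket is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)  # not defined on every platform
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Keepalive only detects a dead peer and keeps NAT state fresh; the reconnect-only-on-failure policy still has to live in the client code that owns the socket.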
|
||||||
|
|
||||||
|
### Option 2 — Service “soft reset” (if possible without disabling required services)
|
||||||
|
If there exists any way to restart the 2255 service without power cycling:
|
||||||
|
- LAN TCP toggle (if it doesn’t require web mode)
|
||||||
|
- any “restart comms” command (unknown)
|
||||||
|
- any maintenance menu sequence
|
||||||
|
then SLMM could:
|
||||||
|
- detect wedge
|
||||||
|
- trigger soft reset
|
||||||
|
- recover automatically
|
||||||
|
|
||||||
|
Current constraint: web app mode appears to disable other services, so this may not be viable.
|
||||||
|
|
||||||
|
### Option 3 — Hardware watchdog power cycle (industrial but reliable)
|
||||||
|
If this is a firmware bug with no clean workaround:
|
||||||
|
- Add a remotely controlled relay/power switch
|
||||||
|
- On wedge detection, power-cycle NL-43 automatically
|
||||||
|
- Optionally schedule a nightly power cycle to prevent leak accumulation
|
||||||
|
|
||||||
|
This is “field reality” and often the only long-term move with embedded devices.
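
A watchdog of this kind could be sketched as follows. Everything here is hypothetical: the class name, the three-strikes threshold, and the `power_cycle` callback (which would wrap whatever relay or smart switch is installed) are assumptions. The only part taken from this investigation is the wedge signature itself: port 2255 refused while port 21 still accepts connections.

```python
from typing import Callable


class WedgeWatchdog:
    """Count consecutive wedge observations and trigger a power cycle."""

    def __init__(self, power_cycle: Callable[[], None], threshold: int = 3) -> None:
        self.power_cycle = power_cycle   # hypothetical relay/power-switch hook
        self.threshold = threshold       # assumed value, tune per deployment
        self.consecutive_wedged = 0

    def observe(self, control_state: str, ftp_state: str) -> bool:
        """Feed one probe result; return True if a power cycle was triggered."""
        wedged = control_state == "refused" and ftp_state == "open"
        if not wedged:
            self.consecutive_wedged = 0
            return False
        self.consecutive_wedged += 1
        if self.consecutive_wedged >= self.threshold:
            self.consecutive_wedged = 0  # start counting afresh after the reset
            self.power_cycle()
            return True
        return False


if __name__ == "__main__":
    wd = WedgeWatchdog(lambda: print("power cycling NL-43"), threshold=2)
    for states in [("refused", "open"), ("refused", "open")]:
        print(states, "->", wd.observe(*states))
```

Requiring several consecutive observations avoids power-cycling the meter on a single transient refusal, at the cost of a slower recovery.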
|
||||||
|
|
||||||
|
### Option 4 — Vendor escalation (Rion)
|
||||||
|
You now have excellent evidence:
|
||||||
|
- LAN-side proof: 2255 dead while 21 alive
|
||||||
|
- WAN packet evidence
|
||||||
|
- clear isolation of RX55 innocence
|
||||||
|
|
||||||
|
This is strong enough to send to Rion support as a firmware defect report.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 12) Repro “wedge bundle” checklist (for future captures)
|
||||||
|
When the wedge happens again, capture these before power cycling:
|
||||||
|
|
||||||
|
1) From server:
|
||||||
|
- `nc -vz 63.45.161.30 2255` (expect refused)
|
||||||
|
- `nc -vz 63.45.161.30 21` (expect success if FTP alive)
|
||||||
|
|
||||||
|
2) From LAN side (via switch/laptop):
|
||||||
|
- `Test-NetConnection 192.168.1.10 -Port 2255`
|
||||||
|
- `Test-NetConnection 192.168.1.10 -Port 21`
|
||||||
|
|
||||||
|
3) Optional: packet capture around the refused attempt.
|
||||||
|
|
||||||
|
4) Record:
|
||||||
|
- last successful poll timestamp
|
||||||
|
- last FTP session timestamp
|
||||||
|
- any scheduled start/stop/download cycles near wedge time
|
||||||
|
- SLMM connection reuse/rotation settings in effect
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 13) Final, current-state summary (as of 2026-02-18)
|
||||||
|
- The issue is **NOT** the RX55 rejecting inbound connections.
|
||||||
|
- The NL-43 is **alive**, reachable on LAN, and FTP works.
|
||||||
|
- The NL-43’s **TCP control listener on 2255 stops listening** while the device remains otherwise healthy.
|
||||||
|
- The wedge can occur hours after successful operations.
|
||||||
|
- The unit is needed in the field immediately, so investigation pauses.
|
||||||
|
- Next phase: controlled tests to isolate trigger + implement mitigation (persistent socket or watchdog reset).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 14) Notes / misc observations
|
||||||
|
- The Wireshark trace showed repeated FTP sessions were opened and closed cleanly, but SLMM’s “FTP requests” were not valid FTP (causing `530 Not logged in`). That was part of experimentation, not a normal workflow.
|
||||||
|
- UDP “success” via netcat is not meaningful because UDP has no handshake; it simply indicates no ICMP unreachable was returned.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**End of document.**
|
||||||
File diff suppressed because it is too large
+322
@@ -0,0 +1,322 @@
|
|||||||
|
"""
|
||||||
|
Threshold alert engine.
|
||||||
|
|
||||||
|
Each unit can have any number of AlertRules. A rule is evaluated against the
|
||||||
|
unit's live monitor snapshots via a small per-(unit, rule) state machine:
|
||||||
|
|
||||||
|
IDLE --(metric exceeds threshold for duration_s)--> ACTIVE (fire ONSET)
|
||||||
|
ACTIVE --(metric recovers past hysteresis for duration_s)--> IDLE (fire CLEAR)
|
||||||
|
|
||||||
|
duration_s debounces both edges; clear_margin_db adds hysteresis so a level
|
||||||
|
hovering at the threshold doesn't flap. Onset and clear are distinct events.
|
||||||
|
|
||||||
|
The state-machine logic (`_evaluate_step`) is intentionally pure — no DB, no
|
||||||
|
real clock — so it can be unit-tested with a synthetic level series and a fake
|
||||||
|
clock. The AlertEvaluator wraps it with rule loading, scheduling, persistence,
|
||||||
|
and dispatch. Dispatch is a server log for now (POC); the seam to POST events to
|
||||||
|
a Terra-View webhook (email/SMS) is _dispatch().
|
||||||
|
"""
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from datetime import datetime, timedelta
|
||||||
|
from typing import Dict, List, Optional, Tuple
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Local timezone offset for schedule windows (same env var services.py uses).
|
||||||
|
_TZ_OFFSET_HOURS = float(os.getenv("TIMEZONE_OFFSET", "-5"))
|
||||||
|
|
||||||
|
# How long to cache a unit's rules before re-querying the DB (rules change rarely).
|
||||||
|
_RULE_CACHE_TTL_S = 15.0
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class RuleState:
|
||||||
|
"""In-memory runtime state for one (unit, rule)."""
|
||||||
|
phase: str = "idle" # "idle" | "active"
|
||||||
|
edge_since: Optional[float] = None # when the current edge condition began (clock time)
|
||||||
|
peak: float = 0.0
|
||||||
|
event_id: Optional[int] = None # the open AlertEvent row (for the clear update)
|
||||||
|
last_onset: Optional[float] = None # time of the last onset (for cooldown)
|
||||||
|
|
||||||
|
|
||||||
|
def _exceeds(value: float, rule) -> bool:
|
||||||
|
if rule.comparison == "below":
|
||||||
|
return value < rule.threshold_db
|
||||||
|
return value > rule.threshold_db
|
||||||
|
|
||||||
|
|
||||||
|
def _recovered(value: float, rule) -> bool:
|
||||||
|
margin = rule.clear_margin_db or 0.0
|
||||||
|
if rule.comparison == "below":
|
||||||
|
return value > rule.threshold_db + margin
|
||||||
|
return value < rule.threshold_db - margin
|
||||||
|
|
||||||
|
|
||||||
|
def _evaluate_step(state: RuleState, value: float, now: float, rule) -> Optional[str]:
|
||||||
|
"""Advance the state machine by one reading.
|
||||||
|
|
||||||
|
Pure: mutates `state`, returns 'onset' | 'clear' | None. `now` is injected so
|
||||||
|
tests can drive a fake clock.
|
||||||
|
"""
|
||||||
|
duration = rule.duration_s or 0
|
||||||
|
|
||||||
|
if state.phase == "idle":
|
||||||
|
if _exceeds(value, rule):
|
||||||
|
if state.edge_since is None:
|
||||||
|
state.edge_since = now
|
||||||
|
if now - state.edge_since >= duration:
|
||||||
|
# Cooldown: suppress a new onset within cooldown_s of the last one
|
||||||
|
# (stops a repeatedly-breaching signal from flooding the history).
|
||||||
|
# Hold edge_since so it fires the moment cooldown lapses if still
|
||||||
|
# breaching — don't reset it here.
|
||||||
|
cooldown = getattr(rule, "cooldown_s", 0) or 0
|
||||||
|
if state.last_onset is not None and (now - state.last_onset) < cooldown:
|
||||||
|
return None
|
||||||
|
state.phase = "active"
|
||||||
|
state.edge_since = None
|
||||||
|
state.peak = value
|
||||||
|
state.last_onset = now
|
||||||
|
return "onset"
|
||||||
|
else:
|
||||||
|
state.edge_since = None
|
||||||
|
return None
|
||||||
|
|
||||||
|
# active
|
||||||
|
if rule.comparison == "below":
|
||||||
|
state.peak = min(state.peak, value)
|
||||||
|
else:
|
||||||
|
state.peak = max(state.peak, value)
|
||||||
|
|
||||||
|
if _recovered(value, rule):
|
||||||
|
if state.edge_since is None:
|
||||||
|
state.edge_since = now
|
||||||
|
if now - state.edge_since >= duration:
|
||||||
|
state.phase = "idle"
|
||||||
|
state.edge_since = None
|
||||||
|
return "clear"
|
||||||
|
else:
|
||||||
|
state.edge_since = None
|
||||||
|
return None
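The onset/clear hysteresis above is easiest to follow when driven with a fake clock. The following is a minimal sketch, not the module's code: `RuleState` here is an assumed shape (the real class is defined earlier in the file), `step` is a condensed restatement of `_evaluate_step` for an "above" rule only, and the rule values are made up for illustration.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class RuleState:
    # Assumed fields, inferred from how _evaluate_step uses the state object.
    phase: str = "idle"
    edge_since: Optional[float] = None
    peak: float = 0.0
    last_onset: Optional[float] = None


def step(state: RuleState, value: float, now: float, rule) -> Optional[str]:
    """Condensed restatement of the state machine for an 'above' rule."""
    if state.phase == "idle":
        if value <= rule.threshold_db:
            state.edge_since = None          # breach interrupted: restart the timer
            return None
        if state.edge_since is None:
            state.edge_since = now
        if now - state.edge_since < rule.duration_s:
            return None                      # breaching, but not for long enough yet
        if state.last_onset is not None and now - state.last_onset < rule.cooldown_s:
            return None                      # in cooldown; edge_since is kept
        state.phase, state.edge_since = "active", None
        state.peak, state.last_onset = value, now
        return "onset"

    state.peak = max(state.peak, value)
    if value >= rule.threshold_db - rule.clear_margin_db:
        state.edge_since = None              # still inside the hysteresis band
        return None
    if state.edge_since is None:
        state.edge_since = now
    if now - state.edge_since < rule.duration_s:
        return None
    state.phase, state.edge_since = "idle", None
    return "clear"


# Hypothetical rule: alert above 85 dB held for 5 s, clear below 83 dB held for 5 s.
rule = SimpleNamespace(threshold_db=85.0, clear_margin_db=2.0, duration_s=5, cooldown_s=0)
state = RuleState()
readings = [(0, 90.0), (3, 91.0), (5, 95.0), (6, 84.0), (7, 82.0), (12, 82.0)]
events = [step(state, value, now, rule) for now, value in readings]
print(events)
```

The 84 dB reading at t=6 is below the threshold but inside the 2 dB clear margin, so it does not start the recovery timer; only the 82 dB readings do.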
|
||||||
|
|
||||||
|
|
||||||
|
def _in_window(now_minutes: int, start: str, end: str) -> bool:
|
||||||
|
"""Is now_minutes (minutes since local midnight) within [start, end)?
|
||||||
|
Handles wraparound windows like 22:00–07:00."""
|
||||||
|
def _m(s: str) -> int:
|
||||||
|
h, m = s.split(":")
|
||||||
|
return int(h) * 60 + int(m)
|
||||||
|
s, e = _m(start), _m(end)
|
||||||
|
if s == e:
|
||||||
|
return True
|
||||||
|
if s < e:
|
||||||
|
return s <= now_minutes < e
|
||||||
|
return now_minutes >= s or now_minutes < e # wraparound
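As a quick check of the wraparound branch, here is a standalone restatement of the same half-open window test with a few sample times. The function name `in_window` and the sample schedule are illustrative only.

```python
def in_window(now_minutes: int, start: str, end: str) -> bool:
    """True if now_minutes (minutes since local midnight) is in [start, end)."""
    def to_minutes(hhmm: str) -> int:
        hours, minutes = hhmm.split(":")
        return int(hours) * 60 + int(minutes)

    s, e = to_minutes(start), to_minutes(end)
    if s == e:
        return True                      # identical bounds are treated as "always on"
    if s < e:
        return s <= now_minutes < e      # ordinary same-day window
    return now_minutes >= s or now_minutes < e   # window crosses midnight


# Night-time window 22:00 to 07:00
late_evening = in_window(23 * 60, "22:00", "07:00")        # 23:00
just_before_end = in_window(6 * 60 + 59, "22:00", "07:00")  # 06:59
at_end = in_window(7 * 60, "22:00", "07:00")                # 07:00, end is exclusive
midday = in_window(12 * 60, "22:00", "07:00")               # 12:00
```

Because the end bound is exclusive, a rule scheduled 22:00 to 07:00 stops being evaluated at exactly 07:00.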
|
||||||
|
|
||||||
|
|
||||||
|
class AlertEvaluator:
|
||||||
|
def __init__(self):
|
||||||
|
self._states: Dict[Tuple[str, int], RuleState] = {}
|
||||||
|
self._rule_cache: Dict[str, Tuple[float, list]] = {} # unit_id -> (fetched_at, rules)
|
||||||
|
self._offline_events: Dict[str, int] = {} # unit_id -> open connectivity AlertEvent id
|
||||||
|
logger.info("[ALERT] rule-based evaluator ready")
|
||||||
|
|
||||||
|
async def evaluate(self, unit_id: str, snap) -> None:
|
||||||
|
"""Evaluate every enabled rule for this unit against one snapshot."""
|
||||||
|
rules = self._get_rules(unit_id)
|
||||||
|
if not rules:
|
||||||
|
return
|
||||||
|
now = asyncio.get_running_loop().time()
|
||||||
|
for rule in rules:
|
||||||
|
if not self._in_schedule(rule):
|
||||||
|
continue
|
||||||
|
raw = getattr(snap, rule.metric, None)
|
||||||
|
try:
|
||||||
|
value = float(raw)
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
continue # missing / non-numeric ("-.-")
|
||||||
|
state = self._states.setdefault((unit_id, rule.id), RuleState())
|
||||||
|
action = _evaluate_step(state, value, now, rule)
|
||||||
|
if action == "onset":
|
||||||
|
await self._on_onset(unit_id, rule, value, state)
|
||||||
|
elif action == "clear":
|
||||||
|
await self._on_clear(unit_id, rule, value, state)
|
||||||
|
|
||||||
|
# -- rule loading (cached) ----------------------------------------------
|
||||||
|
|
||||||
|
def _get_rules(self, unit_id: str) -> list:
|
||||||
|
loop_now = asyncio.get_running_loop().time()
|
||||||
|
cached = self._rule_cache.get(unit_id)
|
||||||
|
if cached and loop_now - cached[0] < _RULE_CACHE_TTL_S:
|
||||||
|
return cached[1]
|
||||||
|
rules = self._load_rules(unit_id)
|
||||||
|
self._rule_cache[unit_id] = (loop_now, rules)
|
||||||
|
return rules
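The caching pattern used by `_get_rules` and `invalidate` can be sketched in isolation with an injected clock. Everything named here (`TTLRuleCache`, the `loader` callable, the 30-second TTL) is hypothetical; the real TTL is whatever `_RULE_CACHE_TTL_S` is set to elsewhere in the module.

```python
from typing import Callable, Dict, List, Optional, Tuple


class TTLRuleCache:
    """Per-unit cache of loaded rules with a time-to-live."""

    def __init__(self, loader: Callable[[str], list], ttl_s: float,
                 clock: Callable[[], float]):
        self._loader = loader
        self._ttl_s = ttl_s
        self._clock = clock
        self._cache: Dict[str, Tuple[float, list]] = {}  # unit_id -> (fetched_at, rules)

    def get(self, unit_id: str) -> list:
        now = self._clock()
        cached = self._cache.get(unit_id)
        if cached and now - cached[0] < self._ttl_s:
            return cached[1]
        rules = self._loader(unit_id)
        self._cache[unit_id] = (now, rules)   # an empty list is cached as well
        return rules

    def invalidate(self, unit_id: Optional[str] = None) -> None:
        if unit_id is None:
            self._cache.clear()
        else:
            self._cache.pop(unit_id, None)


calls: List[str] = []
fake_now = [0.0]


def fake_loader(unit_id: str) -> list:
    calls.append(unit_id)
    return [f"rule-for-{unit_id}"]


cache = TTLRuleCache(fake_loader, ttl_s=30.0, clock=lambda: fake_now[0])
cache.get("nl43-1")          # miss: loads
fake_now[0] = 10.0
cache.get("nl43-1")          # hit: within TTL
loads_within_ttl = len(calls)
fake_now[0] = 31.0
cache.get("nl43-1")          # expired: loads again
cache.invalidate("nl43-1")
cache.get("nl43-1")          # invalidated: loads again
```

Note that a failed load which returns `[]` is cached like any other result, so a transient database error suppresses rule evaluation for that unit until the TTL lapses or `invalidate` is called.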
|
||||||
|
|
||||||
|
def _load_rules(self, unit_id: str) -> list:
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import AlertRule
|
||||||
|
db = SessionLocal()
|
||||||
|
try:
|
||||||
|
return db.query(AlertRule).filter_by(unit_id=unit_id, enabled=True).all()
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"[ALERT] failed to load rules for {unit_id}: {e}")
|
||||||
|
return []
|
||||||
|
finally:
|
||||||
|
db.close()
|
||||||
|
|
||||||
|
def invalidate(self, unit_id: Optional[str] = None) -> None:
|
||||||
|
"""Drop cached rules so a change is picked up immediately."""
|
||||||
|
if unit_id is None:
|
||||||
|
self._rule_cache.clear()
|
||||||
|
else:
|
||||||
|
self._rule_cache.pop(unit_id, None)
|
||||||
|
|
||||||
|
def forget_rule(self, unit_id: str, rule_id: int) -> None:
|
||||||
|
"""Drop a rule's per-(unit, rule) state machine after the rule is edited or
|
||||||
|
deleted, so a stale 'active' phase / open event_id from the old config
|
||||||
|
doesn't bleed into the new one (mis-firing a clear or suppressing an onset)."""
|
||||||
|
self._states.pop((unit_id, rule_id), None)
|
||||||
|
|
||||||
|
# -- scheduling ----------------------------------------------------------
|
||||||
|
|
||||||
|
def _in_schedule(self, rule) -> bool:
|
||||||
|
if not rule.schedule_start or not rule.schedule_end:
|
||||||
|
day_ok = self._day_ok(rule)
|
||||||
|
return day_ok
|
||||||
|
local = datetime.utcnow() + timedelta(hours=_TZ_OFFSET_HOURS)
|
||||||
|
if not self._day_ok(rule, local):
|
||||||
|
return False
|
||||||
|
return _in_window(local.hour * 60 + local.minute, rule.schedule_start, rule.schedule_end)
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _day_ok(rule, local: Optional[datetime] = None) -> bool:
|
||||||
|
if not rule.schedule_days:
|
||||||
|
return True
|
||||||
|
if local is None:
|
||||||
|
local = datetime.utcnow() + timedelta(hours=_TZ_OFFSET_HOURS)
|
||||||
|
allowed = {int(d) for d in str(rule.schedule_days).split(",") if d.strip() != ""}
|
||||||
|
return local.weekday() in allowed # Mon=0
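The `schedule_days` value is parsed as a comma-separated list of weekday numbers (Monday is 0). A small sketch of that parsing, using a hypothetical helper name `day_allowed` and two fixed dates:

```python
from datetime import datetime
from typing import Optional


def day_allowed(schedule_days: Optional[str], local: datetime) -> bool:
    """True if the rule has no day restriction or local's weekday is listed."""
    if not schedule_days:
        return True
    # Blank entries (for example from a trailing comma) are skipped.
    allowed = {int(d) for d in str(schedule_days).split(",") if d.strip() != ""}
    return local.weekday() in allowed


monday = datetime(2024, 1, 1)      # a Monday, weekday() == 0
saturday = datetime(2024, 1, 6)    # a Saturday, weekday() == 5

weekdays_only = "0,1,2,3,4"
on_monday = day_allowed(weekdays_only, monday)
on_saturday = day_allowed(weekdays_only, saturday)
```

A non-numeric entry such as `"mon"` would raise `ValueError` in `int()`, so values are presumably validated before they are stored.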
|
||||||
|
|
||||||
|
# -- event persistence + dispatch ---------------------------------------
|
||||||
|
|
||||||
|
async def _on_onset(self, unit_id: str, rule, value: float, state: RuleState) -> None:
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import AlertEvent
|
||||||
|
db = SessionLocal()
|
||||||
|
try:
|
||||||
|
evt = AlertEvent(
|
||||||
|
rule_id=rule.id, unit_id=unit_id, rule_name=rule.name,
|
||||||
|
metric=rule.metric, threshold_db=rule.threshold_db,
|
||||||
|
onset_value=value, peak_value=value, status="active",
|
||||||
|
)
|
||||||
|
db.add(evt)
|
||||||
|
db.commit()
|
||||||
|
db.refresh(evt)
|
||||||
|
state.event_id = evt.id
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"[ALERT] failed to record onset for {unit_id}: {e}")
|
||||||
|
finally:
|
||||||
|
db.close()
|
||||||
|
await self._dispatch(
|
||||||
|
"ONSET", unit_id, rule,
|
||||||
|
f"{rule.metric.upper()}={value:.1f} dB "
|
||||||
|
f"{'<' if rule.comparison == 'below' else '>'} {rule.threshold_db:.1f} dB"
|
||||||
|
f"{f' for {rule.duration_s}s' if rule.duration_s else ''}",
|
||||||
|
)
|
||||||
|
|
||||||
|
async def _on_clear(self, unit_id: str, rule, value: float, state: RuleState) -> None:
|
||||||
|
peak = state.peak
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import AlertEvent
|
||||||
|
db = SessionLocal()
|
||||||
|
try:
|
||||||
|
if state.event_id is not None:
|
||||||
|
evt = db.query(AlertEvent).filter_by(id=state.event_id).first()
|
||||||
|
if evt:
|
||||||
|
evt.clear_at = datetime.utcnow()
|
||||||
|
evt.peak_value = peak
|
||||||
|
evt.status = "cleared"
|
||||||
|
db.commit()
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"[ALERT] failed to record clear for {unit_id}: {e}")
|
||||||
|
finally:
|
||||||
|
db.close()
|
||||||
|
state.event_id = None
|
||||||
|
await self._dispatch(
|
||||||
|
"CLEAR", unit_id, rule,
|
||||||
|
f"recovered to {value:.1f} dB (peak {peak:.1f} dB)",
|
||||||
|
)
|
||||||
|
|
||||||
|
# -- connectivity (device offline/online) -------------------------------
|
||||||
|
#
|
||||||
|
# Raised by the live monitor when it loses / regains contact with a device.
|
||||||
|
# Persisted as an AlertEvent (sentinel rule_id=0, metric="connectivity") so it
|
||||||
|
# lands in the same events/inbox/ack pipeline as threshold alerts. The in-memory
|
||||||
|
# map dedupes; the DB query also dedupes across a process restart.
|
||||||
|
|
||||||
|
async def device_offline(self, unit_id: str) -> None:
|
||||||
|
if unit_id in self._offline_events:
|
||||||
|
return # already flagged offline
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import AlertEvent
|
||||||
|
db = SessionLocal()
|
||||||
|
try:
|
||||||
|
existing = db.query(AlertEvent).filter_by(
|
||||||
|
unit_id=unit_id, metric="connectivity", status="active").first()
|
||||||
|
if existing: # already open in the DB (e.g. carried across a restart)
|
||||||
|
self._offline_events[unit_id] = existing.id
|
||||||
|
return
|
||||||
|
evt = AlertEvent(
|
||||||
|
rule_id=0, unit_id=unit_id, rule_name="Device unreachable",
|
||||||
|
metric="connectivity", threshold_db=0.0, status="active",
|
||||||
|
)
|
||||||
|
db.add(evt)
|
||||||
|
db.commit()
|
||||||
|
db.refresh(evt)
|
||||||
|
self._offline_events[unit_id] = evt.id
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"[ALERT] failed to record offline for {unit_id}: {e}")
|
||||||
|
finally:
|
||||||
|
db.close()
|
||||||
|
await self._dispatch_raw("OFFLINE", unit_id, "Device unreachable",
|
||||||
|
"live monitor lost contact with the device")
|
||||||
|
|
||||||
|
async def device_online(self, unit_id: str) -> None:
|
||||||
|
self._offline_events.pop(unit_id, None)
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import AlertEvent
|
||||||
|
db = SessionLocal()
|
||||||
|
cleared = 0
|
||||||
|
try:
|
||||||
|
opened = db.query(AlertEvent).filter_by(
|
||||||
|
unit_id=unit_id, metric="connectivity", status="active").all()
|
||||||
|
for evt in opened:
|
||||||
|
evt.clear_at = datetime.utcnow()
|
||||||
|
evt.status = "cleared"
|
||||||
|
cleared += 1
|
||||||
|
if cleared:
|
||||||
|
db.commit()
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"[ALERT] failed to record online for {unit_id}: {e}")
|
||||||
|
finally:
|
||||||
|
db.close()
|
||||||
|
if cleared: # only announce recovery if it was actually flagged offline
|
||||||
|
await self._dispatch_raw("ONLINE", unit_id, "Device recovered",
|
||||||
|
"live monitor regained contact with the device")
|
||||||
|
|
||||||
|
# -- dispatch ------------------------------------------------------------
|
||||||
|
|
||||||
|
async def _dispatch(self, kind: str, unit_id: str, rule, detail: str) -> None:
|
||||||
|
await self._dispatch_raw(kind, unit_id, rule.name, detail)
|
||||||
|
|
||||||
|
async def _dispatch_raw(self, kind: str, unit_id: str, name: str, detail: str) -> None:
|
||||||
|
"""POC dispatch: server log. Swap in a Terra-View webhook (email/SMS) here."""
|
||||||
|
logger.warning(f"[ALERT:{kind}] {unit_id} '{name}': {detail}")
|
||||||
|
|
||||||
|
|
||||||
|
# Module-level singleton (the monitor calls alert_evaluator.evaluate per snapshot)
|
||||||
|
alert_evaluator = AlertEvaluator()
|
||||||
@@ -0,0 +1,411 @@
|
|||||||
|
"""
|
||||||
|
Background polling service for NL43 devices.
|
||||||
|
|
||||||
|
This module provides continuous, automatic polling of configured NL43 devices
|
||||||
|
at configurable intervals. Status snapshots are persisted to the database
|
||||||
|
for fast API access without querying devices on every request.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
from datetime import datetime, timedelta
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from sqlalchemy.orm import Session
|
||||||
|
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import NL43Config, NL43Status
|
||||||
|
from app.services import NL43Client, persist_snapshot, sync_measurement_start_time_from_ftp
|
||||||
|
from app.device_logger import log_device_event, cleanup_old_logs
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Global polling default. Set SLMM_POLLING_ENABLED=false to start an instance in
|
||||||
|
# standby (running but not polling and not holding device connections) — e.g. a
|
||||||
|
# dev box that must not latch onto a device that a prod instance owns.
|
||||||
|
POLLING_ENABLED_DEFAULT = os.getenv("SLMM_POLLING_ENABLED", "true").lower() == "true"
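The comparison above accepts only the string `true` (in any letter case). A short sketch of the same parse against a plain dict makes the accepted values explicit; `polling_enabled` is an illustrative helper, not part of the module.

```python
from typing import Mapping


def polling_enabled(env: Mapping[str, str]) -> bool:
    """Same parse as POLLING_ENABLED_DEFAULT, applied to any mapping."""
    return env.get("SLMM_POLLING_ENABLED", "true").lower() == "true"


unset = polling_enabled({})                                     # default: polling on
upper = polling_enabled({"SLMM_POLLING_ENABLED": "TRUE"})       # case-insensitive
off = polling_enabled({"SLMM_POLLING_ENABLED": "false"})
numeric = polling_enabled({"SLMM_POLLING_ENABLED": "1"})        # not recognised as true
```

Values such as `1`, `yes`, or `on` therefore start the instance in standby, which is worth knowing when writing deployment configs.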
|
||||||
|
|
||||||
|
|
||||||
|
class BackgroundPoller:
|
||||||
|
"""
|
||||||
|
Background task that continuously polls NL43 devices and updates status cache.
|
||||||
|
|
||||||
|
Features:
|
||||||
|
- Per-device configurable poll intervals (30 seconds to 6 hours)
|
||||||
|
- Automatic offline detection (marks unreachable after 3 consecutive failures)
|
||||||
|
- Dynamic sleep intervals based on device configurations
|
||||||
|
- Graceful shutdown on application stop
|
||||||
|
- Respects existing rate limiting (1-second minimum between commands)
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
self._task: Optional[asyncio.Task] = None
|
||||||
|
self._running = False
|
||||||
|
self._logger = logger
|
||||||
|
self._last_cleanup = None # Track last log cleanup time
|
||||||
|
self._last_pool_log = None # Track last connection pool heartbeat log
|
||||||
|
self._active = POLLING_ENABLED_DEFAULT # Global polling on/off (standby toggle)
|
||||||
|
|
||||||
|
async def start(self):
|
||||||
|
"""Start the background polling task."""
|
||||||
|
if self._running:
|
||||||
|
self._logger.warning("Background poller already running")
|
||||||
|
return
|
||||||
|
|
||||||
|
self._running = True
|
||||||
|
self._task = asyncio.create_task(self._poll_loop())
|
||||||
|
self._logger.info("Background poller task created")
|
||||||
|
|
||||||
|
async def stop(self):
|
||||||
|
"""Gracefully stop the background polling task."""
|
||||||
|
if not self._running:
|
||||||
|
return
|
||||||
|
|
||||||
|
self._logger.info("Stopping background poller...")
|
||||||
|
self._running = False
|
||||||
|
|
||||||
|
if self._task:
|
||||||
|
try:
|
||||||
|
await asyncio.wait_for(self._task, timeout=5.0)
|
||||||
|
except asyncio.TimeoutError:
|
||||||
|
self._logger.warning("Background poller task did not stop gracefully, cancelling...")
|
||||||
|
self._task.cancel()
|
||||||
|
try:
|
||||||
|
await self._task
|
||||||
|
except asyncio.CancelledError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
self._logger.info("Background poller stopped")
|
||||||
|
|
||||||
|
def is_active(self) -> bool:
|
||||||
|
"""Whether background polling is currently active (vs standby)."""
|
||||||
|
return self._active
|
||||||
|
|
||||||
|
async def set_active(self, active: bool):
|
||||||
|
"""Globally enable/disable polling at runtime.
|
||||||
|
|
||||||
|
When deactivated, the loop stays alive but polls nothing and releases all
|
||||||
|
device connections, so this SLMM instance stops occupying the devices'
|
||||||
|
single connection slots (e.g. so a prod instance can take over). Runtime
|
||||||
|
state only — on restart the instance returns to SLMM_POLLING_ENABLED.
|
||||||
|
"""
|
||||||
|
self._active = active
|
||||||
|
if active:
|
||||||
|
self._logger.info("[SYSTEM] Background polling ACTIVATED")
|
||||||
|
else:
|
||||||
|
self._logger.info("[SYSTEM] Background polling DEACTIVATED (standby) — releasing connections")
|
||||||
|
await self._release_all_connections()
|
||||||
|
|
||||||
|
async def _release_all_connections(self):
|
||||||
|
"""Gracefully close every pooled device connection (no-op if none)."""
|
||||||
|
from app.services import _connection_pool
|
||||||
|
for device_key in list(_connection_pool.get_stats().get("connections", {})):
|
||||||
|
await _connection_pool.discard(device_key)
|
||||||
|
|
||||||
|
async def _poll_loop(self):
|
||||||
|
"""Main polling loop that runs continuously."""
|
||||||
|
self._logger.info("Background polling loop started")
|
||||||
|
|
||||||
|
while self._running:
|
||||||
|
if self._active:
|
||||||
|
try:
|
||||||
|
await self._poll_all_devices()
|
||||||
|
except Exception as e:
|
||||||
|
self._logger.error(f"Error in poll loop: {e}", exc_info=True)
|
||||||
|
else:
|
||||||
|
# Standby: poll nothing and hold no device connection slots,
|
||||||
|
# so another SLMM instance (e.g. prod) can talk to the devices.
|
||||||
|
try:
|
||||||
|
await self._release_all_connections()
|
||||||
|
except Exception as e:
|
||||||
|
self._logger.warning(f"Standby connection release failed: {e}")
|
||||||
|
|
||||||
|
# Run log cleanup once per hour
|
||||||
|
try:
|
||||||
|
now = datetime.utcnow()
|
||||||
|
if self._last_cleanup is None or (now - self._last_cleanup).total_seconds() > 3600:
|
||||||
|
cleanup_old_logs()
|
||||||
|
self._last_cleanup = now
|
||||||
|
except Exception as e:
|
||||||
|
self._logger.warning(f"Log cleanup failed: {e}")
|
||||||
|
|
||||||
|
# Log connection pool status every 15 minutes
|
||||||
|
try:
|
||||||
|
now = datetime.utcnow()
|
||||||
|
if self._last_pool_log is None or (now - self._last_pool_log).total_seconds() > 900:
|
||||||
|
from app.services import _connection_pool
|
||||||
|
stats = _connection_pool.get_stats()
|
||||||
|
conns = stats.get("connections", {})
|
||||||
|
if conns:
|
||||||
|
for key, c in conns.items():
|
||||||
|
self._logger.info(
|
||||||
|
f"[POOL] {key} — age={c['age_seconds']}s idle={c['idle_seconds']}s alive={c['alive']}"
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
self._logger.info("[POOL] No active connections in pool")
|
||||||
|
self._last_pool_log = now
|
||||||
|
except Exception as e:
|
||||||
|
self._logger.warning(f"Pool status log failed: {e}")
|
||||||
|
|
||||||
|
# Calculate dynamic sleep interval
|
||||||
|
sleep_time = self._calculate_sleep_interval()
|
||||||
|
self._logger.debug(f"Sleeping for {sleep_time} seconds until next poll cycle")
|
||||||
|
|
||||||
|
# Sleep in small intervals to allow graceful shutdown
|
||||||
|
for _ in range(int(sleep_time)):
|
||||||
|
if not self._running:
|
||||||
|
break
|
||||||
|
await asyncio.sleep(1)
|
||||||
|
|
||||||
|
self._logger.info("Background polling loop exited")
|
||||||
|
|
||||||
|
async def _poll_all_devices(self):
|
||||||
|
"""Poll all configured devices that are due for polling."""
|
||||||
|
db: Session = SessionLocal()
|
||||||
|
try:
|
||||||
|
# Get all devices with TCP and polling enabled
|
||||||
|
configs = db.query(NL43Config).filter_by(
|
||||||
|
tcp_enabled=True,
|
||||||
|
poll_enabled=True
|
||||||
|
).all()
|
||||||
|
|
||||||
|
if not configs:
|
||||||
|
self._logger.debug("No devices configured for polling")
|
||||||
|
return
|
||||||
|
|
||||||
|
self._logger.debug(f"Checking {len(configs)} devices for polling")
|
||||||
|
now = datetime.utcnow()
|
||||||
|
polled_count = 0
|
||||||
|
|
||||||
|
from app.monitor import monitor_manager
|
||||||
|
|
||||||
|
for cfg in configs:
|
||||||
|
if not self._running:
|
||||||
|
break
|
||||||
|
|
||||||
|
# Skip units with an active live monitor: it polls them at ~1Hz and
|
||||||
|
# keeps the status cache fresh, so a redundant background poll would just
|
||||||
|
# add load/lock-contention on the device's single connection.
|
||||||
|
if monitor_manager.is_active(cfg.unit_id):
|
||||||
|
self._logger.debug(f"Skipping {cfg.unit_id} — live monitor active")
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Get current status
|
||||||
|
status = db.query(NL43Status).filter_by(unit_id=cfg.unit_id).first()
|
||||||
|
|
||||||
|
# Check if device should be polled
|
||||||
|
if self._should_poll(cfg, status, now):
|
||||||
|
await self._poll_device(cfg, db)
|
||||||
|
polled_count += 1
|
||||||
|
else:
|
||||||
|
self._logger.debug(f"Skipping {cfg.unit_id} - interval not elapsed")
|
||||||
|
|
||||||
|
if polled_count > 0:
|
||||||
|
self._logger.info(f"Polled {polled_count}/{len(configs)} devices")
|
||||||
|
|
||||||
|
finally:
|
||||||
|
db.close()
|
||||||
|
|
||||||
|
def _should_poll(self, cfg: NL43Config, status: Optional[NL43Status], now: datetime) -> bool:
|
||||||
|
"""
|
||||||
|
Determine if a device should be polled based on interval and last poll time.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
cfg: Device configuration
|
||||||
|
status: Current device status (may be None if never polled)
|
||||||
|
now: Current UTC timestamp
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if device should be polled, False otherwise
|
||||||
|
"""
|
||||||
|
# If never polled before, poll now
|
||||||
|
if not status or not status.last_poll_attempt:
|
||||||
|
self._logger.debug(f"Device {cfg.unit_id} never polled, polling now")
|
||||||
|
return True
|
||||||
|
|
||||||
|
# Calculate elapsed time since last poll attempt
|
||||||
|
interval = cfg.poll_interval_seconds or 60
|
||||||
|
elapsed = (now - status.last_poll_attempt).total_seconds()
|
||||||
|
|
||||||
|
should_poll = elapsed >= interval
|
||||||
|
|
||||||
|
if should_poll:
|
||||||
|
self._logger.debug(
|
||||||
|
f"Device {cfg.unit_id} due for polling: {elapsed:.1f}s elapsed, interval={interval}s"
|
||||||
|
)
|
||||||
|
|
||||||
|
return should_poll
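The due-check reduces to a comparison between elapsed time and the configured interval. A standalone sketch with plain values instead of ORM objects (the function name `is_due` and the timestamps are illustrative):

```python
from datetime import datetime, timedelta
from typing import Optional


def is_due(last_attempt: Optional[datetime], interval_s: Optional[int],
           now: datetime) -> bool:
    """True if the device was never polled or its interval has elapsed."""
    if last_attempt is None:
        return True
    interval = interval_s or 60          # same 60 s fallback as the method above
    return (now - last_attempt).total_seconds() >= interval


now = datetime(2024, 1, 1, 12, 0, 0)
never_polled = is_due(None, 300, now)
too_soon = is_due(now - timedelta(seconds=59), None, now)
exactly_due = is_due(now - timedelta(seconds=60), None, now)
```

The check is based on the last poll attempt rather than the last success, so a failing device is retried at its normal interval instead of on every loop pass.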
|
||||||
|
|
||||||
|
async def _poll_device(self, cfg: NL43Config, db: Session):
|
||||||
|
"""
|
||||||
|
Poll a single device and update its status in the database.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
cfg: Device configuration
|
||||||
|
db: Database session
|
||||||
|
"""
|
||||||
|
unit_id = cfg.unit_id
|
||||||
|
self._logger.info(f"Polling device {unit_id} at {cfg.host}:{cfg.tcp_port}")
|
||||||
|
|
||||||
|
# Get or create status record
|
||||||
|
status = db.query(NL43Status).filter_by(unit_id=unit_id).first()
|
||||||
|
if not status:
|
||||||
|
status = NL43Status(unit_id=unit_id)
|
||||||
|
db.add(status)
|
||||||
|
|
||||||
|
# Update last_poll_attempt immediately
|
||||||
|
status.last_poll_attempt = datetime.utcnow()
|
||||||
|
db.commit()
|
||||||
|
|
||||||
|
# Create client and attempt to poll
|
||||||
|
client = NL43Client(
|
||||||
|
cfg.host,
|
||||||
|
cfg.tcp_port,
|
||||||
|
timeout=5.0,
|
||||||
|
ftp_username=cfg.ftp_username,
|
||||||
|
ftp_password=cfg.ftp_password,
|
||||||
|
ftp_port=cfg.ftp_port or 21
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Send DOD? command to get device status
|
||||||
|
snap = await client.request_dod()
|
||||||
|
snap.unit_id = unit_id
|
||||||
|
|
||||||
|
# Success - persist snapshot and reset failure counter
|
||||||
|
persist_snapshot(snap, db)
|
||||||
|
|
||||||
|
status.is_reachable = True
|
||||||
|
status.consecutive_failures = 0
|
||||||
|
status.last_success = datetime.utcnow()
|
||||||
|
status.last_error = None
|
||||||
|
|
||||||
|
db.commit()
|
||||||
|
self._logger.info(f"✓ Successfully polled {unit_id}")
|
||||||
|
|
||||||
|
# Log to device log
|
||||||
|
log_device_event(
|
||||||
|
unit_id, "INFO", "POLL",
|
||||||
|
f"Poll success: state={snap.measurement_state}, Leq={snap.leq}, Lp={snap.lp}",
|
||||||
|
db
|
||||||
|
)
|
||||||
|
|
||||||
|
# Check if device is measuring but has no start time recorded
|
||||||
|
# This happens if measurement was started before SLMM began polling
|
||||||
|
# or after a service restart
|
||||||
|
status = db.query(NL43Status).filter_by(unit_id=unit_id).first()
|
||||||
|
|
||||||
|
# Reset the sync flag when measurement stops (so next measurement can sync)
|
||||||
|
if status and status.measurement_state != "Start":
|
||||||
|
if status.start_time_sync_attempted:
|
||||||
|
status.start_time_sync_attempted = False
|
||||||
|
db.commit()
|
||||||
|
self._logger.debug(f"Reset FTP sync flag for {unit_id} (measurement stopped)")
|
||||||
|
log_device_event(unit_id, "DEBUG", "STATE", "Measurement stopped, reset FTP sync flag", db)
|
||||||
|
|
||||||
|
# Attempt FTP sync if:
|
||||||
|
# - Device is measuring
|
||||||
|
# - No start time recorded
|
||||||
|
# - FTP sync not already attempted for this measurement
|
||||||
|
# - FTP is configured
|
||||||
|
if (status and
|
||||||
|
status.measurement_state == "Start" and
|
||||||
|
status.measurement_start_time is None and
|
||||||
|
not status.start_time_sync_attempted and
|
||||||
|
cfg.ftp_enabled and
|
||||||
|
cfg.ftp_username and
|
||||||
|
cfg.ftp_password):
|
||||||
|
|
||||||
|
self._logger.info(
|
||||||
|
f"Device {unit_id} is measuring but has no start time - "
|
||||||
|
f"attempting FTP sync"
|
||||||
|
)
|
||||||
|
log_device_event(unit_id, "INFO", "SYNC", "Attempting FTP sync for measurement start time", db)
|
||||||
|
|
||||||
|
# Mark that we attempted sync (prevents repeated attempts on failure)
|
||||||
|
status.start_time_sync_attempted = True
|
||||||
|
db.commit()
|
||||||
|
|
||||||
|
try:
|
||||||
|
synced = await sync_measurement_start_time_from_ftp(
|
||||||
|
unit_id=unit_id,
|
||||||
|
host=cfg.host,
|
||||||
|
tcp_port=cfg.tcp_port,
|
||||||
|
ftp_port=cfg.ftp_port or 21,
|
||||||
|
ftp_username=cfg.ftp_username,
|
||||||
|
ftp_password=cfg.ftp_password,
|
||||||
|
db=db
|
||||||
|
)
|
||||||
|
if synced:
|
||||||
|
self._logger.info(f"✓ FTP sync succeeded for {unit_id}")
|
||||||
|
log_device_event(unit_id, "INFO", "SYNC", "FTP sync succeeded - measurement start time updated", db)
|
||||||
|
else:
|
||||||
|
self._logger.warning(f"FTP sync returned False for {unit_id}")
|
||||||
|
log_device_event(unit_id, "WARNING", "SYNC", "FTP sync returned False", db)
|
||||||
|
except Exception as sync_err:
|
||||||
|
self._logger.warning(
|
||||||
|
f"FTP sync failed for {unit_id}: {sync_err}"
|
||||||
|
)
|
||||||
|
log_device_event(unit_id, "ERROR", "SYNC", f"FTP sync failed: {sync_err}", db)
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
# Failure - increment counter and potentially mark offline
|
||||||
|
status.consecutive_failures = (status.consecutive_failures or 0) + 1  # tolerate a NULL counter on a fresh row
|
||||||
|
error_msg = str(e)[:500] # Truncate to prevent bloat
|
||||||
|
status.last_error = error_msg
|
||||||
|
|
||||||
|
# Mark unreachable after 3 consecutive failures
|
||||||
|
if status.consecutive_failures >= 3:
|
||||||
|
if status.is_reachable: # Only log transition
|
||||||
|
self._logger.warning(
|
||||||
|
f"Device {unit_id} marked unreachable after {status.consecutive_failures} failures: {error_msg}"
|
||||||
|
)
|
||||||
|
log_device_event(unit_id, "ERROR", "POLL", f"Device marked UNREACHABLE after {status.consecutive_failures} failures: {error_msg}", db)
|
||||||
|
status.is_reachable = False
|
||||||
|
else:
|
||||||
|
self._logger.warning(
|
||||||
|
f"Poll failed for {unit_id} (attempt {status.consecutive_failures}/3): {error_msg}"
|
||||||
|
)
|
||||||
|
log_device_event(unit_id, "WARNING", "POLL", f"Poll failed (attempt {status.consecutive_failures}/3): {error_msg}", db)
|
||||||
|
|
||||||
|
db.commit()
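The failure branch only reports the reachable-to-unreachable transition once, on the third consecutive failure. A minimal sketch of that counting logic, using a hypothetical `Status` dataclass in place of the `NL43Status` model:

```python
from dataclasses import dataclass


@dataclass
class Status:
    # Stand-in for the two NL43Status fields the failure branch touches.
    is_reachable: bool = True
    consecutive_failures: int = 0


def record_failure(status: Status, threshold: int = 3) -> bool:
    """Count one failed poll; return True only when the device flips to unreachable."""
    status.consecutive_failures += 1
    if status.consecutive_failures >= threshold and status.is_reachable:
        status.is_reachable = False
        return True
    return False


def record_success(status: Status) -> None:
    """A successful poll resets the counter and marks the device reachable."""
    status.is_reachable = True
    status.consecutive_failures = 0


status = Status()
transitions = [record_failure(status) for _ in range(4)]
```

Further failures after the transition keep incrementing the counter but do not log the transition again until a success resets the state.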
|
||||||
|
|
||||||
|
def _calculate_sleep_interval(self) -> int:
|
||||||
|
"""
|
||||||
|
Calculate the next sleep interval based on all device poll intervals.
|
||||||
|
|
||||||
|
Returns a dynamic sleep time that ensures responsive polling:
|
||||||
|
- Minimum 30 seconds (prevents tight loops)
|
||||||
|
- Maximum 300 seconds / 5 minutes (ensures reasonable responsiveness for long intervals)
|
||||||
|
- Generally half the minimum device interval
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Sleep interval in seconds
|
||||||
|
"""
|
||||||
|
db: Session = SessionLocal()
|
||||||
|
try:
|
||||||
|
configs = db.query(NL43Config).filter_by(
|
||||||
|
tcp_enabled=True,
|
||||||
|
poll_enabled=True
|
||||||
|
).all()
|
||||||
|
|
||||||
|
if not configs:
|
||||||
|
return 60 # Default sleep when no devices configured
|
||||||
|
|
||||||
|
# Get all intervals
|
||||||
|
intervals = [cfg.poll_interval_seconds or 60 for cfg in configs]
|
||||||
|
min_interval = min(intervals)
|
||||||
|
|
||||||
|
# Use half the minimum interval, but cap between 30-300 seconds
|
||||||
|
# This allows longer sleep times when polling intervals are long (e.g., hourly)
|
||||||
|
sleep_time = max(30, min(300, min_interval // 2))
|
||||||
|
|
||||||
|
return sleep_time
|
||||||
|
|
||||||
|
finally:
|
||||||
|
db.close()
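The clamping rule in `_calculate_sleep_interval` can be checked with plain integers. This sketch restates the arithmetic only; `sleep_interval` is an illustrative name and the interval lists are made up.

```python
from typing import List, Optional


def sleep_interval(poll_intervals: List[Optional[int]]) -> int:
    """Half the shortest device interval, clamped to the range 30..300 seconds."""
    if not poll_intervals:
        return 60                                   # no devices configured
    intervals = [i or 60 for i in poll_intervals]   # unset intervals default to 60 s
    return max(30, min(300, min(intervals) // 2))


mixed = sleep_interval([60, 3600])    # shortest is 60 s  -> 30
hourly = sleep_interval([3600])       # 1800 capped       -> 300
fast = sleep_interval([30])           # 15 raised         -> 30
medium = sleep_interval([200])        # 100 is in range   -> 100
```

One consequence of the 300-second cap is that the loop wakes at least every five minutes even when every device is on a multi-hour interval.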
|
||||||
|
|
||||||
|
|
||||||
|
# Global singleton instance
|
||||||
|
poller = BackgroundPoller()
|
||||||
@@ -0,0 +1,277 @@
|
|||||||
|
"""
|
||||||
|
Per-device logging system.
|
||||||
|
|
||||||
|
Provides dual output: database entries for structured queries and file logs for backup.
|
||||||
|
Each device gets its own log file in data/logs/{unit_id}.log with rotation.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
from datetime import datetime, timedelta
|
||||||
|
from logging.handlers import RotatingFileHandler
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from sqlalchemy.orm import Session
|
||||||
|
|
||||||
|
from app.database import SessionLocal
|
||||||
|
from app.models import DeviceLog
|
||||||
|
|
||||||
|
# Configure base logger
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Log directory (persisted in Docker volume)
|
||||||
|
LOG_DIR = Path(os.path.dirname(os.path.dirname(__file__))) / "data" / "logs"
|
||||||
|
LOG_DIR.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
# Per-device file loggers (cached)
|
||||||
|
_device_file_loggers: dict = {}
|
||||||
|
|
||||||
|
# Log retention (days)
|
||||||
|
LOG_RETENTION_DAYS = int(os.getenv("LOG_RETENTION_DAYS", "7"))
|
||||||
|
|
||||||
|
|
||||||
|
def _get_file_logger(unit_id: str) -> logging.Logger:
|
||||||
|
"""Get or create a file logger for a specific device."""
|
||||||
|
if unit_id in _device_file_loggers:
|
||||||
|
return _device_file_loggers[unit_id]
|
||||||
|
|
||||||
|
# Create device-specific logger
|
||||||
|
device_logger = logging.getLogger(f"device.{unit_id}")
|
||||||
|
device_logger.setLevel(logging.DEBUG)
|
||||||
|
|
||||||
|
# Avoid duplicate handlers
|
||||||
|
if not device_logger.handlers:
|
||||||
|
# Create rotating file handler (5 MB max, keep 3 backups)
|
||||||
|
log_file = LOG_DIR / f"{unit_id}.log"
|
||||||
|
handler = RotatingFileHandler(
|
||||||
|
log_file,
|
||||||
|
maxBytes=5 * 1024 * 1024, # 5 MB
|
||||||
|
backupCount=3,
|
||||||
|
encoding="utf-8"
|
||||||
|
)
|
||||||
|
handler.setLevel(logging.DEBUG)
|
||||||
|
|
||||||
|
# Format: timestamp [LEVEL] [CATEGORY] message
|
||||||
|
formatter = logging.Formatter(
|
||||||
|
"%(asctime)s [%(levelname)s] [%(category)s] %(message)s",
|
||||||
|
datefmt="%Y-%m-%d %H:%M:%S"
|
||||||
|
)
|
||||||
|
handler.setFormatter(formatter)
|
||||||
|
device_logger.addHandler(handler)
|
||||||
|
|
||||||
|
# Don't propagate to root logger
|
||||||
|
device_logger.propagate = False
|
||||||
|
|
||||||
|
_device_file_loggers[unit_id] = device_logger
|
||||||
|
return device_logger
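Because the format string references `%(category)s`, every call on a device logger has to supply `category` through `extra`. The sketch below shows the mechanism with an in-memory stream instead of a `RotatingFileHandler`; the logger name and message are illustrative.

```python
import io
import logging

stream = io.StringIO()

demo_logger = logging.getLogger("device.demo-unit")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False            # keep demo output away from the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(levelname)s] [%(category)s] %(message)s"))
demo_logger.addHandler(handler)

# `extra` adds the custom `category` attribute to the LogRecord.
demo_logger.info("Poll success", extra={"category": "POLL"})

output = stream.getvalue()
print(output, end="")
```

A call that omits `extra` leaves the record without a `category` attribute, so the handler cannot format it and the logging module reports a handler error rather than writing the line. This is likely why `log_device_event` always passes `extra={"category": category}`.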
|
||||||
|
|
||||||
|
|
||||||
|
def log_device_event(
|
||||||
|
unit_id: str,
|
||||||
|
level: str,
|
||||||
|
category: str,
|
||||||
|
message: str,
|
||||||
|
db: Optional[Session] = None
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Log an event for a specific device.
|
||||||
|
|
||||||
|
Writes to both:
|
||||||
|
1. Database (DeviceLog table) for structured queries
|
||||||
|
2. File (data/logs/{unit_id}.log) for backup/debugging
|
||||||
|
|
||||||
|
Args:
|
||||||
|
unit_id: Device identifier
|
||||||
|
level: Log level (DEBUG, INFO, WARNING, ERROR)
|
||||||
|
category: Event category (TCP, FTP, POLL, COMMAND, STATE, SYNC)
|
||||||
|
message: Log message
|
||||||
|
db: Optional database session (creates one if not provided)
|
||||||
|
"""
|
||||||
|
timestamp = datetime.utcnow()
|
||||||
|
|
||||||
|
# Write to file log
|
||||||
|
try:
|
||||||
|
file_logger = _get_file_logger(unit_id)
|
||||||
|
log_func = getattr(file_logger, level.lower(), file_logger.info)
|
||||||
|
# Pass category as extra for formatter
|
||||||
|
log_func(message, extra={"category": category})
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Failed to write file log for {unit_id}: {e}")
|
||||||
|
|
||||||
|
# Write to database
|
||||||
|
close_db = False
|
||||||
|
try:
|
||||||
|
if db is None:
|
||||||
|
db = SessionLocal()
|
||||||
|
close_db = True
|
||||||
|
|
||||||
|
log_entry = DeviceLog(
|
||||||
|
unit_id=unit_id,
|
||||||
|
timestamp=timestamp,
|
||||||
|
level=level.upper(),
|
||||||
|
category=category.upper(),
|
||||||
|
message=message
|
||||||
|
)
|
||||||
|
db.add(log_entry)
|
||||||
|
db.commit()
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Failed to write DB log for {unit_id}: {e}")
|
||||||
|
if db:
|
||||||
|
db.rollback()
|
||||||
|
finally:
|
||||||
|
if close_db and db:
|
||||||
|
db.close()
|
||||||
|
|
||||||
|
|
||||||
|
+def cleanup_old_logs(retention_days: Optional[int] = None, db: Optional[Session] = None):
+    """
+    Delete log entries older than retention period.
+
+    Args:
+        retention_days: Days to retain (default: LOG_RETENTION_DAYS env var or 7)
+        db: Optional database session
+    """
+    if retention_days is None:
+        retention_days = LOG_RETENTION_DAYS
+
+    cutoff = datetime.utcnow() - timedelta(days=retention_days)
+
+    close_db = False
+    try:
+        if db is None:
+            db = SessionLocal()
+            close_db = True
+
+        deleted = db.query(DeviceLog).filter(DeviceLog.timestamp < cutoff).delete()
+        db.commit()
+
+        if deleted > 0:
+            logger.info(f"Cleaned up {deleted} log entries older than {retention_days} days")
+
+    except Exception as e:
+        logger.error(f"Failed to cleanup old logs: {e}")
+        if db:
+            db.rollback()
+    finally:
+        if close_db and db:
+            db.close()
+
+
+def get_device_logs(
+    unit_id: str,
+    limit: int = 100,
+    offset: int = 0,
+    level: Optional[str] = None,
+    category: Optional[str] = None,
+    since: Optional[datetime] = None,
+    db: Optional[Session] = None
+) -> list:
+    """
+    Query log entries for a specific device.
+
+    Args:
+        unit_id: Device identifier
+        limit: Max entries to return (default: 100)
+        offset: Number of entries to skip (default: 0)
+        level: Filter by level (DEBUG, INFO, WARNING, ERROR)
+        category: Filter by category (TCP, FTP, POLL, COMMAND, STATE, SYNC)
+        since: Filter entries after this timestamp
+        db: Optional database session
+
+    Returns:
+        List of log entries as dicts
+    """
+    close_db = False
+    try:
+        if db is None:
+            db = SessionLocal()
+            close_db = True
+
+        query = db.query(DeviceLog).filter(DeviceLog.unit_id == unit_id)
+
+        if level:
+            query = query.filter(DeviceLog.level == level.upper())
+        if category:
+            query = query.filter(DeviceLog.category == category.upper())
+        if since:
+            query = query.filter(DeviceLog.timestamp >= since)
+
+        # Order by newest first
+        query = query.order_by(DeviceLog.timestamp.desc())
+
+        # Apply pagination
+        entries = query.offset(offset).limit(limit).all()
+
+        return [
+            {
+                "id": e.id,
+                "timestamp": e.timestamp.isoformat() + "Z",
+                "level": e.level,
+                "category": e.category,
+                "message": e.message
+            }
+            for e in entries
+        ]
+
+    finally:
+        if close_db and db:
+            db.close()
+
+
+def get_log_stats(unit_id: str, db: Optional[Session] = None) -> dict:
+    """
+    Get log statistics for a device.
+
+    Returns:
+        Dict with counts by level and category
+    """
+    close_db = False
+    try:
+        if db is None:
+            db = SessionLocal()
+            close_db = True
+
+        total = db.query(DeviceLog).filter(DeviceLog.unit_id == unit_id).count()
+
+        # Count by level
+        level_counts = {}
+        for level in ["DEBUG", "INFO", "WARNING", "ERROR"]:
+            count = db.query(DeviceLog).filter(
+                DeviceLog.unit_id == unit_id,
+                DeviceLog.level == level
+            ).count()
+            if count > 0:
+                level_counts[level] = count
+
+        # Count by category
+        category_counts = {}
+        for category in ["TCP", "FTP", "POLL", "COMMAND", "STATE", "SYNC", "GENERAL"]:
+            count = db.query(DeviceLog).filter(
+                DeviceLog.unit_id == unit_id,
+                DeviceLog.category == category
+            ).count()
+            if count > 0:
+                category_counts[category] = count
+
+        # Get oldest and newest
+        oldest = db.query(DeviceLog).filter(
+            DeviceLog.unit_id == unit_id
+        ).order_by(DeviceLog.timestamp.asc()).first()
+
+        newest = db.query(DeviceLog).filter(
+            DeviceLog.unit_id == unit_id
+        ).order_by(DeviceLog.timestamp.desc()).first()
+
+        return {
+            "total": total,
+            "by_level": level_counts,
+            "by_category": category_counts,
+            "oldest": oldest.timestamp.isoformat() + "Z" if oldest else None,
+            "newest": newest.timestamp.isoformat() + "Z" if newest else None
+        }
+
+    finally:
+        if close_db and db:
+            db.close()
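The per-level and per-category counts in `get_log_stats` issue one `COUNT` query per value. As a minimal sketch of an alternative, assuming the same `device_logs` columns, a single `GROUP BY` query per dimension yields the same non-zero counts. The in-memory SQLite table, the sample rows, and the `count_by` helper are illustrative only and are not part of this diff.

```python
import sqlite3


def count_by(conn: sqlite3.Connection, column: str, unit_id: str) -> dict:
    """Return {value: count} for one column of device_logs, for one unit."""
    # Column names cannot be bound as SQL parameters, so whitelist them.
    if column not in ("level", "category"):
        raise ValueError(f"unsupported column: {column}")
    rows = conn.execute(
        f"SELECT {column}, COUNT(*) FROM device_logs "
        f"WHERE unit_id = ? GROUP BY {column}",
        (unit_id,),
    ).fetchall()
    return {value: count for value, count in rows}


# Hypothetical sample data standing in for the real device_logs table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_logs ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, unit_id TEXT, level TEXT, "
    "category TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO device_logs (unit_id, level, category, message) VALUES (?, ?, ?, ?)",
    [
        ("nl43-1", "INFO", "POLL", "poll ok"),
        ("nl43-1", "INFO", "TCP", "connected"),
        ("nl43-1", "ERROR", "TCP", "timeout"),
        ("nl43-2", "INFO", "POLL", "poll ok"),
    ],
)

level_counts = count_by(conn, "level", "nl43-1")
category_counts = count_by(conn, "category", "nl43-1")
print(level_counts, category_counts)
```

One behavioural difference to keep in mind: a grouped query reports every value present in the table, whereas the fixed lists in `get_log_stats` only report the known levels and categories.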
+77
-14
@@ -1,5 +1,6 @@
 import os
 import logging
+from contextlib import asynccontextmanager
 from fastapi import FastAPI, Request
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import HTMLResponse
@@ -7,6 +8,7 @@ from fastapi.templating import Jinja2Templates
 
 from app.database import Base, engine
 from app import routers
+from app.background_poller import poller
 
 # Configure logging
 logging.basicConfig(
@@ -23,10 +25,54 @@ logger = logging.getLogger(__name__)
 Base.metadata.create_all(bind=engine)
 logger.info("Database tables initialized")
 
+
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    """Manage application lifecycle - startup and shutdown events."""
+    from app.services import _connection_pool
+
+    # Startup
+    logger.info("Starting TCP connection pool cleanup task...")
+    _connection_pool.start_cleanup()
+    logger.info("Starting background poller...")
+    await poller.start()
+    logger.info("Background poller started")
+
+    # Auto-start keepalive live monitors for units configured for 24/7 monitoring
+    # (monitor_enabled). This is what keeps alerting running unattended across
+    # restarts — without it a feed only runs while someone has the live view open.
+    try:
+        from app.monitor import monitor_manager
+        from app.database import SessionLocal
+        from app.models import NL43Config
+        db = SessionLocal()
+        try:
+            units = db.query(NL43Config).filter_by(monitor_enabled=True, tcp_enabled=True).all()
+            for cfg in units:
+                m = await monitor_manager.get(cfg.unit_id)
+                await m.set_keepalive(True)
+                logger.info(f"Auto-started keepalive monitor for {cfg.unit_id}")
+        finally:
+            db.close()
+    except Exception as e:
+        logger.error(f"Failed to auto-start monitors: {e}")
+
+    yield  # Application runs
+
+    # Shutdown
+    logger.info("Stopping background poller...")
+    await poller.stop()
+    logger.info("Background poller stopped")
+    logger.info("Closing TCP connection pool...")
+    await _connection_pool.close_all()
+    logger.info("TCP connection pool closed")
+
+
 app = FastAPI(
     title="SLMM NL43 Addon",
-    description="Standalone module for NL43 configuration and status APIs",
-    version="0.1.0",
+    description="Standalone module for NL43 configuration and status APIs with background polling",
+    version="0.4.0",
+    lifespan=lifespan,
 )
 
 # CORS configuration - use environment variable for allowed origins
@@ -49,7 +95,12 @@ app.include_router(routers.router)
 
 @app.get("/", response_class=HTMLResponse)
 def index(request: Request):
-    return templates.TemplateResponse("index.html", {"request": request})
+    return templates.TemplateResponse(request, "index.html")
+
+
+@app.get("/roster", response_class=HTMLResponse)
+def roster(request: Request):
+    return templates.TemplateResponse(request, "roster.html")
 
 
 @app.get("/health")
@@ -60,10 +111,14 @@ async def health():
 
 @app.get("/health/devices")
 async def health_devices():
-    """Enhanced health check that tests device connectivity."""
+    """Enhanced health check that tests device connectivity.
+
+    Uses the connection pool to avoid unnecessary TCP handshakes — if a
+    cached connection exists and is alive, the device is reachable.
+    """
     from sqlalchemy.orm import Session
     from app.database import SessionLocal
-    from app.services import NL43Client
+    from app.services import _connection_pool
     from app.models import NL43Config
 
     db: Session = SessionLocal()
@@ -73,7 +128,7 @@ async def health_devices():
         configs = db.query(NL43Config).filter_by(tcp_enabled=True).all()
 
         for cfg in configs:
-            client = NL43Client(cfg.host, cfg.tcp_port, timeout=2.0, ftp_username=cfg.ftp_username, ftp_password=cfg.ftp_password)
+            device_key = f"{cfg.host}:{cfg.tcp_port}"
             status = {
                 "unit_id": cfg.unit_id,
                 "host": cfg.host,
@@ -83,14 +138,22 @@ async def health_devices():
             }
 
             try:
-                # Try to connect (don't send command to avoid rate limiting issues)
-                import asyncio
-                reader, writer = await asyncio.wait_for(
-                    asyncio.open_connection(cfg.host, cfg.tcp_port), timeout=2.0
-                )
-                writer.close()
-                await writer.wait_closed()
-                status["reachable"] = True
+                # Check if pool already has a live connection (zero-cost check)
+                pool_stats = _connection_pool.get_stats()
+                conn_info = pool_stats["connections"].get(device_key)
+                if conn_info and conn_info["alive"]:
+                    status["reachable"] = True
+                    status["source"] = "pool"
+                else:
+                    # No cached connection — do a lightweight acquire/release
+                    # This opens a connection if needed but keeps it in the pool
+                    import asyncio
+                    reader, writer, from_cache = await _connection_pool.acquire(
+                        device_key, cfg.host, cfg.tcp_port, timeout=2.0
+                    )
+                    await _connection_pool.release(device_key, reader, writer, cfg.host, cfg.tcp_port)
+                    status["reachable"] = True
+                    status["source"] = "cached" if from_cache else "new"
             except Exception as e:
                 status["error"] = str(type(e).__name__)
                 logger.warning(f"Device {cfg.unit_id} health check failed: {e}")
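The `lifespan` function added in this file follows the async context manager pattern: code before `yield` runs at startup and code after it runs at shutdown. A minimal stdlib-only sketch of that ordering is shown here. `FakePoller` and the `events` list are hypothetical stand-ins for the real poller and connection pool, and no FastAPI app is involved.

```python
import asyncio
from contextlib import asynccontextmanager

events = []


class FakePoller:
    """Stand-in for the background poller; records calls instead of polling."""

    async def start(self):
        events.append("poller.start")

    async def stop(self):
        events.append("poller.stop")


poller = FakePoller()


@asynccontextmanager
async def lifespan(app):
    # Startup: runs before the application starts serving.
    await poller.start()
    yield
    # Shutdown: runs once the application stops serving.
    await poller.stop()


async def main():
    # A framework would enter and exit this context around its serving loop.
    async with lifespan(app=None):
        events.append("serving")


asyncio.run(main())
print(events)
```

In this sketch the shutdown half is skipped if the body of the `async with` raises, because nothing wraps the `yield` in `try`/`finally`.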
+108
-1
@@ -1,4 +1,4 @@
-from sqlalchemy import Column, String, DateTime, Boolean, Integer, Text, func
+from sqlalchemy import Column, String, DateTime, Boolean, Integer, Float, Text, func
 from app.database import Base
 
 
@@ -19,6 +19,14 @@ class NL43Config(Base):
     ftp_password = Column(String, nullable=True)  # FTP login password
     web_enabled = Column(Boolean, default=False)
 
+    # Background polling configuration
+    poll_interval_seconds = Column(Integer, nullable=True, default=60)  # Polling interval (10-3600 seconds)
+    poll_enabled = Column(Boolean, default=True)  # Enable/disable background polling for this device
+
+    # Live monitor (fan-out DOD feed). Keepalive runs it 24/7 even with no viewer,
+    # which is what makes alerting continuous. On by default; toggleable from the UI.
+    monitor_enabled = Column(Boolean, default=True)
+
 
 class NL43Status(Base):
     """
@@ -37,8 +45,107 @@ class NL43Status(Base):
     lmax = Column(String, nullable=True)  # Maximum level
     lmin = Column(String, nullable=True)  # Minimum level
     lpeak = Column(String, nullable=True)  # Peak level
+    ln1 = Column(String, nullable=True)  # Percentile slot LN1 (configurable; device default L5, contract L1)
+    ln2 = Column(String, nullable=True)  # Percentile slot LN2 (configurable; device default L10)
     battery_level = Column(String, nullable=True)
     power_source = Column(String, nullable=True)
     sd_remaining_mb = Column(String, nullable=True)
     sd_free_ratio = Column(String, nullable=True)
     raw_payload = Column(Text, nullable=True)
+
+    # Background polling status
+    is_reachable = Column(Boolean, default=True)  # Device reachability status
+    consecutive_failures = Column(Integer, default=0)  # Count of consecutive poll failures
+    last_poll_attempt = Column(DateTime, nullable=True)  # Last time background poller attempted to poll
+    last_success = Column(DateTime, nullable=True)  # Last successful poll timestamp
+    last_error = Column(Text, nullable=True)  # Last error message (truncated to 500 chars)
+
+    # FTP start time sync tracking
+    start_time_sync_attempted = Column(Boolean, default=False)  # True if FTP sync was attempted for current measurement
+
+
+class DeviceLog(Base):
+    """
+    Per-device log entries for debugging and audit trail.
+    Stores events like commands, state changes, errors, and FTP operations.
+    """
+
+    __tablename__ = "device_logs"
+
+    id = Column(Integer, primary_key=True, autoincrement=True)
+    unit_id = Column(String, index=True, nullable=False)
+    timestamp = Column(DateTime, default=func.now(), index=True)
+    level = Column(String, default="INFO")  # DEBUG, INFO, WARNING, ERROR
+    category = Column(String, default="GENERAL")  # TCP, FTP, POLL, COMMAND, STATE, SYNC
+    message = Column(Text, nullable=False)
+
+
+class AlertRule(Base):
+    """A threshold-alert rule evaluated against a unit's live monitor feed.
+
+    Source-agnostic: today it runs over the DOD monitor; the same rule transfers
+    unchanged if a unit's feed is later sourced from FTP intervals.
+    """
+
+    __tablename__ = "alert_rules"
+
+    id = Column(Integer, primary_key=True, autoincrement=True)
+    unit_id = Column(String, index=True, nullable=False)
+    name = Column(String, nullable=False, default="Alert")
+    metric = Column(String, nullable=False, default="lp")  # lp/leq/lmax/lmin/lpeak/ln1/ln2
+    comparison = Column(String, nullable=False, default="above")  # above | below
+    threshold_db = Column(Float, nullable=False)
+    duration_s = Column(Integer, nullable=False, default=0)  # sustained seconds (0 = instant)
+    clear_margin_db = Column(Float, nullable=False, default=2.0)  # hysteresis band
+    cooldown_s = Column(Integer, nullable=False, default=300)  # min seconds between onsets
+    # Optional time-of-day scoping (local time). schedule_start/end as "HH:MM";
+    # null = always active. schedule_days = CSV of 0-6 (Mon=0); null = every day.
+    schedule_start = Column(String, nullable=True)
+    schedule_end = Column(String, nullable=True)
+    schedule_days = Column(String, nullable=True)
+    channels = Column(String, nullable=False, default="log")  # CSV: log,email,sms
+    recipients = Column(Text, nullable=True)  # CSV of emails/phones
+    enabled = Column(Boolean, default=True)
+    created_at = Column(DateTime, default=func.now())
+
+
+class AlertEvent(Base):
+    """A fired alert (onset → clear), for history / inbox / acknowledgement."""
+
+    __tablename__ = "alert_events"
+
+    id = Column(Integer, primary_key=True, autoincrement=True)
+    rule_id = Column(Integer, index=True, nullable=False)
+    unit_id = Column(String, index=True, nullable=False)
+    rule_name = Column(String, nullable=True)
+    metric = Column(String, nullable=False)
+    threshold_db = Column(Float, nullable=False)
+    onset_at = Column(DateTime, default=func.now(), index=True)
+    onset_value = Column(Float, nullable=True)
+    peak_value = Column(Float, nullable=True)
+    clear_at = Column(DateTime, nullable=True)
+    status = Column(String, default="active")  # active | cleared
+    acknowledged_at = Column(DateTime, nullable=True)
+    acknowledged_by = Column(String, nullable=True)
+    notes = Column(Text, nullable=True)
+
+
+class NL43Reading(Base):
+    """Downsampled time-series of live-monitor readings, for the live-chart
+    backfill (so a viewer sees recent trend on open, not a blank chart).
+
+    Viewing only — NOT the report source. Reports use the device's authoritative
+    FTP .rnd intervals. This is a short, capped trail (one row/minute, pruned to
+    a retention window) fed by the monitor's keepalive poll loop.
+    """
+
+    __tablename__ = "nl43_readings"
+
+    id = Column(Integer, primary_key=True, autoincrement=True)
+    unit_id = Column(String, index=True, nullable=False)
+    timestamp = Column(DateTime, default=func.now(), index=True)
+    lp = Column(String, nullable=True)
+    leq = Column(String, nullable=True)
+    lmax = Column(String, nullable=True)
+    ln1 = Column(String, nullable=True)
+    ln2 = Column(String, nullable=True)
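The `AlertRule` columns `threshold_db`, `comparison`, and `clear_margin_db` describe a threshold with a hysteresis band. The evaluator itself is not part of this file, so the following is only a sketch of one plausible reading of those fields: an alert turns on when the value crosses the threshold and turns off only after the value retreats past the threshold by the margin. The `Rule` dataclass and `next_state` function are hypothetical, the strict versus non-strict comparisons are assumptions, and `duration_s`, `cooldown_s`, and the schedule fields are ignored.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Illustrative subset of the AlertRule columns."""
    threshold_db: float
    clear_margin_db: float = 2.0
    comparison: str = "above"  # "above" | "below"


def next_state(rule: Rule, active: bool, value: float) -> bool:
    """Return whether the alert is active after seeing one reading."""
    if rule.comparison == "above":
        if not active:
            return value > rule.threshold_db
        # Stay active until the level falls below threshold minus the margin.
        return value >= rule.threshold_db - rule.clear_margin_db
    if not active:
        return value < rule.threshold_db
    # "below" rules clear once the level rises above threshold plus the margin.
    return value <= rule.threshold_db + rule.clear_margin_db


rule = Rule(threshold_db=85.0)
active = False
history = []
for reading in [80.0, 86.0, 84.0, 83.5, 82.9, 86.0]:
    active = next_state(rule, active, reading)
    history.append(active)
print(history)
```

With a 2 dB margin, the readings 84.0 and 83.5 keep the alert active even though they are under the 85 dB threshold; only 82.9 clears it. This is the flapping that a hysteresis band is meant to prevent.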
+322
@@ -0,0 +1,322 @@
+"""
+Per-device live monitor (fan-out hub).
+
+ONE DOD poll loop per device, broadcast to many subscribers:
+  - browser WebSocket clients (live view) — they no longer each open their own
+    device stream, so the NL43's single-connection limit stops causing the
+    "second viewer sees nothing" contention.
+  - the alert evaluator (threshold alerts), which can keep a device's feed running
+    even with no browser attached.
+  - persistence (each snapshot is written to NL43Status, like the poller does).
+
+The device's one TCP connection is respected: every poll goes through the same
+per-device lock + connection pool in services.py, so the monitor, the background
+poller, and on-demand commands all serialize safely.
+"""
+
+import asyncio
+import logging
+import os
+from datetime import datetime
+from typing import Dict, Optional, Set
+
+from app.database import SessionLocal
+from app.models import NL43Config, NL43Status
+from app.services import NL43Client, persist_snapshot
+from app.alerts import alert_evaluator
+
+logger = logging.getLogger(__name__)
+
+# Extra idle between DOD polls WHEN A BROWSER IS WATCHING. The 1s device rate-limit
+# already paces consecutive DOD? commands, so this just needs to be small — the
+# rate-limit is the real floor (~1.25s/poll effective).
+MONITOR_POLL_INTERVAL = float(os.getenv("MONITOR_POLL_INTERVAL", "0.25"))
+
+# Idle cadence when NO browser is subscribed and the feed is only kept alive for
+# alerting. Same data, ~8x fewer polls -> ~8x less cellular traffic on a metered
+# SIM (~1 GB/device/month at full rate -> ~125 MB). NOTE: this also sets the alert
+# sampling resolution when nobody is watching, so keep it <= the smallest alert
+# duration_s you rely on (default 10s comfortably catches a "sustained 30/60s" rule).
+MONITOR_IDLE_POLL_INTERVAL = float(os.getenv("MONITOR_IDLE_POLL_INTERVAL", "10"))
+
+# Exponential backoff once the device is unreachable, so a powered-off / asleep /
+# out-of-signal device stops churning reconnects every cycle (log spam + a trickle
+# of wasted cellular data on failed SYNs). delay = min(BASE * 2**(fails-1), MAX),
+# reset to full-rate on the first good poll. While a browser is actively watching we
+# cap the backoff lower (WATCHED_MAX) so a recovery surfaces quickly for the viewer.
+MONITOR_BACKOFF_BASE_S = float(os.getenv("MONITOR_BACKOFF_BASE_S", "1"))
+MONITOR_BACKOFF_MAX_S = float(os.getenv("MONITOR_BACKOFF_MAX_S", "60"))
+MONITOR_BACKOFF_WATCHED_MAX_S = float(os.getenv("MONITOR_BACKOFF_WATCHED_MAX_S", "5"))
+
+# How often to refresh the run state (Measure?). It changes rarely, so we cache it
+# and skip that second rate-limited command on most polls — roughly halving the
+# per-update latency (~2.5s -> ~1.3s).
+MONITOR_STATE_REFRESH_S = float(os.getenv("MONITOR_STATE_REFRESH_S", "30"))
+
+# Downsampled trail for the live-chart backfill: store one reading per
+# TRAIL_SAMPLE_S and keep TRAIL_RETENTION_HOURS of it (pruned). Viewing only —
+# reports use the device's FTP .rnd data, not this.
+TRAIL_SAMPLE_S = float(os.getenv("MONITOR_TRAIL_SAMPLE_S", "60"))
+TRAIL_RETENTION_HOURS = float(os.getenv("MONITOR_TRAIL_RETENTION_HOURS", "24"))
+
+# If nothing has been broadcast in this many seconds (e.g. device offline and
+# silent), send a keepalive frame so reverse proxies don't drop the idle WS.
+MONITOR_HEARTBEAT_S = float(os.getenv("MONITOR_HEARTBEAT_S", "25"))
+
+
+def _snapshot_payload(snap, unit_id: str, measurement_start_time) -> dict:
+    """Build the broadcast payload — same shape as the DRD stream, but DOD-sourced
+    so it carries ln1/ln2 (which DRD cannot)."""
+    return {
+        "unit_id": unit_id,
+        "timestamp": datetime.utcnow().isoformat(),
+        "measurement_state": snap.measurement_state,
+        "measurement_start_time": measurement_start_time,
+        "counter": snap.counter,
+        "lp": snap.lp,
+        "leq": snap.leq,
+        "lmax": snap.lmax,
+        "lmin": snap.lmin,
+        "lpeak": snap.lpeak,
+        "ln1": snap.ln1,
+        "ln2": snap.ln2,
+        "raw_payload": snap.raw_payload,
+    }
+
+
+class DeviceMonitor:
+    """Owns a single DOD poll loop for one device and fans each snapshot out to
+    all subscribers. Runs while it has at least one browser subscriber OR the
+    server-side keep-alive (alerting) flag is set."""
+
+    def __init__(self, unit_id: str):
+        self.unit_id = unit_id
+        self._subscribers: Set[asyncio.Queue] = set()
+        self._keepalive = False
+        self._task: Optional[asyncio.Task] = None
+        self._lock = asyncio.Lock()
+        self._last_payload: Optional[dict] = None  # replayed to new subscribers
+        self._consec_fail = 0
+        self._reachable = True  # last broadcast reachability (for transition frames)
+        self._cached_state: Optional[str] = None  # run state, refreshed periodically
+        self._last_state_refresh = 0.0
+        self._last_trail_store = 0.0  # downsample throttle for the backfill trail
+
+    @property
+    def running(self) -> bool:
+        return self._task is not None and not self._task.done()
+
+    def subscriber_count(self) -> int:
+        return len(self._subscribers)
+
+    def _has_demand(self) -> bool:
+        return bool(self._subscribers) or self._keepalive
+
+    def _ensure_task(self) -> None:
+        if self._task is None or self._task.done():
+            self._task = asyncio.create_task(self._run())
+
+    async def subscribe(self) -> asyncio.Queue:
+        q: asyncio.Queue = asyncio.Queue(maxsize=5)
+        async with self._lock:
+            self._subscribers.add(q)
+            # Replay the last frame so a client connecting mid-stream sees data
+            # (or the current 'unreachable' state) immediately, not after a poll.
+            if self._last_payload is not None:
+                try:
+                    q.put_nowait(self._last_payload)
+                except asyncio.QueueFull:
+                    pass
+            self._ensure_task()
+        return q
+
+    async def unsubscribe(self, q: asyncio.Queue) -> None:
+        async with self._lock:
+            self._subscribers.discard(q)
+
+    async def set_keepalive(self, on: bool) -> None:
+        async with self._lock:
+            self._keepalive = on
+            if on:
+                self._ensure_task()
+
+    async def _run(self) -> None:
+        logger.info(f"[MONITOR] {self.unit_id}: feed started")
+        loop = asyncio.get_running_loop()
+        last_send = loop.time()
+        try:
+            while self._has_demand():
+                snap, mst = await self._poll_once()
+                if snap is not None:
+                    if not self._reachable:
+                        # Recovered from an outage — clear the connectivity alert.
+                        try:
+                            await alert_evaluator.device_online(self.unit_id)
+                        except Exception as e:
+                            logger.warning(f"[MONITOR] {self.unit_id}: online alert failed: {e}")
+                    self._consec_fail = 0
+                    self._reachable = True
+                    payload = _snapshot_payload(snap, self.unit_id, mst)
+                    payload["feed_status"] = "ok"
+                    self._broadcast(payload)
+                    last_send = loop.time()
+                    try:
+                        await alert_evaluator.evaluate(self.unit_id, snap)
+                    except Exception as e:
+                        logger.warning(f"[MONITOR] {self.unit_id}: alert eval failed: {e}")
+                else:
+                    # Tell clients the device went offline — once, on transition, after a
+                    # few failures so a momentary blip doesn't flap the UI. Same edge
+                    # raises the device-offline alert.
+                    self._consec_fail += 1
+                    if self._reachable and self._consec_fail >= 3:
+                        self._reachable = False
+                        self._broadcast({
+                            "unit_id": self.unit_id,
+                            "timestamp": datetime.utcnow().isoformat(),
+                            "feed_status": "unreachable",
+                        })
+                        last_send = loop.time()
+                        try:
+                            await alert_evaluator.device_offline(self.unit_id)
+                        except Exception as e:
+                            logger.warning(f"[MONITOR] {self.unit_id}: offline alert failed: {e}")
+
+                # Heartbeat: during quiet/offline stretches, send a keepalive so an
+                # idle WS isn't dropped by a reverse proxy. Not cached (new subscribers
+                # should still get the last real frame, not a heartbeat).
+                if loop.time() - last_send >= MONITOR_HEARTBEAT_S:
+                    self._broadcast({
+                        "unit_id": self.unit_id,
+                        "timestamp": datetime.utcnow().isoformat(),
+                        "feed_status": "ok" if self._reachable else "unreachable",
+                        "heartbeat": True,
+                    }, cache=False)
+                    last_send = loop.time()
+
+                await asyncio.sleep(self._next_delay())
+        finally:
+            logger.info(f"[MONITOR] {self.unit_id}: feed stopped")
+
+    def _next_delay(self) -> float:
+        """Inter-poll delay: exponential backoff while unreachable, full-rate while a
+        browser is watching, relaxed cadence when the feed is keepalive-only."""
+        if self._consec_fail > 0:
+            shift = min(self._consec_fail - 1, 6)  # cap growth at 2**6 = 64x base
+            delay = min(MONITOR_BACKOFF_BASE_S * (2 ** shift), MONITOR_BACKOFF_MAX_S)
+            if self._subscribers:
+                delay = min(delay, MONITOR_BACKOFF_WATCHED_MAX_S)
+            return delay
+        if self._subscribers:
+            return MONITOR_POLL_INTERVAL  # a browser is watching — smooth chart
+        return MONITOR_IDLE_POLL_INTERVAL  # keepalive-only (alerting) — save data
+
+    async def _poll_once(self):
+        """One DOD poll: read, persist, return (snapshot, measurement_start_iso)."""
+        db = SessionLocal()
+        try:
+            cfg = db.query(NL43Config).filter_by(unit_id=self.unit_id).first()
+            if not cfg or not cfg.tcp_enabled:
+                return None, None
+            client = NL43Client(
+                cfg.host, cfg.tcp_port,
+                ftp_username=cfg.ftp_username, ftp_password=cfg.ftp_password,
+                ftp_port=cfg.ftp_port or 21,
+            )
+            # Refresh the run state only every MONITOR_STATE_REFRESH_S; reuse the
+            # cached state otherwise so most polls send just DOD? (one rate-limited
+            # command) instead of DOD? + Measure?.
+            now = asyncio.get_running_loop().time()
+            refresh_state = (self._cached_state is None
+                             or now - self._last_state_refresh >= MONITOR_STATE_REFRESH_S)
+            snap = await client.request_dod(
+                measurement_state=None if refresh_state else self._cached_state
+            )
+            if refresh_state:
+                self._cached_state = snap.measurement_state
+                self._last_state_refresh = now
+            snap.unit_id = self.unit_id
+            persist_snapshot(snap, db)
+            db.commit()
+            # Append to the downsampled backfill trail (~one row per TRAIL_SAMPLE_S).
+            if now - self._last_trail_store >= TRAIL_SAMPLE_S:
+                self._last_trail_store = now
+                self._store_trail(snap, db)
+            status = db.query(NL43Status).filter_by(unit_id=self.unit_id).first()
+            mst = (status.measurement_start_time.isoformat()
+                   if status and status.measurement_start_time else None)
+            return snap, mst
+        except Exception as e:
+            logger.warning(f"[MONITOR] {self.unit_id}: poll failed: {e}")
+            return None, None
+        finally:
+            db.close()
+
+    def _store_trail(self, snap, db) -> None:
+        """Append one downsampled reading to the backfill trail and prune old rows."""
+        from datetime import datetime, timedelta
+        from app.models import NL43Reading
+        try:
+            db.add(NL43Reading(
+                unit_id=self.unit_id, timestamp=datetime.utcnow(),
+                lp=snap.lp, leq=snap.leq, lmax=snap.lmax, ln1=snap.ln1, ln2=snap.ln2,
+            ))
+            cutoff = datetime.utcnow() - timedelta(hours=TRAIL_RETENTION_HOURS)
+            db.query(NL43Reading).filter(
+                NL43Reading.unit_id == self.unit_id,
+                NL43Reading.timestamp < cutoff,
+            ).delete()
+            db.commit()
+        except Exception as e:
+            logger.warning(f"[MONITOR] {self.unit_id}: trail store failed: {e}")
+
+    def _broadcast(self, payload: dict, cache: bool = True) -> None:
+        if cache:
+            self._last_payload = payload  # replayed to new subscribers
+        for q in list(self._subscribers):
+            try:
+                q.put_nowait(payload)
+            except asyncio.QueueFull:
+                # Slow consumer — drop this frame rather than stall the whole feed.
+                pass
+
+
class MonitorManager:
|
||||||
|
"""Registry of per-device monitors (one per unit_id)."""
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
self._monitors: Dict[str, DeviceMonitor] = {}
|
||||||
|
self._lock = asyncio.Lock()
|
||||||
|
|
||||||
|
async def get(self, unit_id: str) -> DeviceMonitor:
|
||||||
|
async with self._lock:
|
||||||
|
m = self._monitors.get(unit_id)
|
||||||
|
if m is None:
|
||||||
|
m = DeviceMonitor(unit_id)
|
||||||
|
self._monitors[unit_id] = m
|
||||||
|
return m
|
||||||
|
|
||||||
|
def is_active(self, unit_id: str) -> bool:
|
||||||
|
"""True if this unit has a running monitor feed (so the background poller
|
||||||
|
can skip it — the monitor already polls it more often)."""
|
||||||
|
m = self._monitors.get(unit_id)
|
||||||
|
return m is not None and m.running
|
||||||
|
|
||||||
|
def status(self) -> dict:
|
||||||
|
return {
|
||||||
|
uid: {
|
||||||
|
"running": m.running,
|
||||||
|
"subscribers": m.subscriber_count(),
|
||||||
|
"keepalive": m._keepalive,
|
||||||
|
"reachable": m._reachable,
|
||||||
|
# what cadence the loop is currently using, for observability
|
||||||
|
"mode": ("backoff" if m._consec_fail > 0
|
||||||
|
else "watched" if m._subscribers
|
||||||
|
else "idle"),
|
||||||
|
}
|
||||||
|
for uid, m in self._monitors.items()
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
# Module-level singleton
|
||||||
|
monitor_manager = MonitorManager()
|
||||||
+1032
-69
File diff suppressed because it is too large
+1079
-216
File diff suppressed because it is too large
@@ -0,0 +1,67 @@
|
|||||||
|
# SLMM Archive
|
||||||
|
|
||||||
|
This directory contains legacy scripts that are no longer needed for normal operation but are preserved for reference.
|
||||||
|
|
||||||
|
## Legacy Migrations (`legacy_migrations/`)
|
||||||
|
|
||||||
|
These migration scripts were used during SLMM development (v0.1.x) to incrementally add database fields. They are **no longer needed** because:
|
||||||
|
|
||||||
|
1. **Fresh databases** get the complete schema automatically from `app/models.py`
|
||||||
|
2. **Existing databases** should already have these fields from previous runs
|
||||||
|
3. **Current migration** is `migrate_add_polling_fields.py` (v0.2.0) in the parent directory
|
||||||
|
|
||||||
|
### Archived Migration Files
|
||||||
|
|
||||||
|
- `migrate_add_counter.py` - Added `counter` field to NL43Status
|
||||||
|
- `migrate_add_measurement_start_time.py` - Added `measurement_start_time` field
|
||||||
|
- `migrate_add_ftp_port.py` - Added `ftp_port` field to NL43Config
|
||||||
|
- `migrate_field_names.py` - Renamed fields for consistency (one-time fix)
|
||||||
|
- `migrate_revert_field_names.py` - Rollback for the rename migration
|
||||||
|
|
||||||
|
**Do not delete** - These provide historical context for database schema evolution.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Legacy Tools
|
||||||
|
|
||||||
|
### `nl43_dod_poll.py`
|
||||||
|
|
||||||
|
Manual polling script that queries a single NL-43 device for DOD (Device On-Demand) data.
|
||||||
|
|
||||||
|
**Status**: Replaced by background polling system in v0.2.0
|
||||||
|
|
||||||
|
**Why archived**:
|
||||||
|
- Background poller (`app/background_poller.py`) now handles continuous polling automatically
|
||||||
|
- No need for manual polling scripts
|
||||||
|
- Kept for reference in case manual querying is needed for debugging
|
||||||
|
|
||||||
|
**How to use** (if needed):
|
||||||
|
```bash
|
||||||
|
cd /home/serversdown/tmi/slmm/archive
|
||||||
|
python3 nl43_dod_poll.py <host> <port> <unit_id>
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Active Scripts (Still in Parent Directory)
|
||||||
|
|
||||||
|
These scripts are **actively used** and documented in the main README:
|
||||||
|
|
||||||
|
### Migrations
|
||||||
|
- `migrate_add_polling_fields.py` - **v0.2.0 migration** - Adds background polling fields
|
||||||
|
- `migrate_add_ftp_credentials.py` - **Legacy FTP migration** - Adds FTP auth fields
|
||||||
|
|
||||||
|
### Testing
|
||||||
|
- `test_polling.sh` - Comprehensive test suite for background polling features
|
||||||
|
- `test_settings_endpoint.py` - Tests device settings API
|
||||||
|
- `test_sleep_mode_auto_disable.py` - Tests automatic sleep mode handling
|
||||||
|
|
||||||
|
### Utilities
|
||||||
|
- `set_ftp_credentials.py` - Command-line tool to set FTP credentials for a device
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Version History
|
||||||
|
|
||||||
|
- **v0.2.0** (2026-01-15) - Background polling system added, manual polling scripts archived
|
||||||
|
- **v0.1.0** (2025-12-XX) - Initial release with incremental migrations
|
||||||
+1
-1
@@ -483,7 +483,7 @@ POST /{unit_id}/ftp/enable
|
|||||||
```
|
```
|
||||||
Enables FTP server on the device.
|
Enables FTP server on the device.
|
||||||
|
|
||||||
**Note:** FTP and TCP are mutually exclusive. Enabling FTP will temporarily disable TCP control.
|
**Note:** ~~FTP and TCP are mutually exclusive. Enabling FTP will temporarily disable TCP control.~~ As of v0.2.0, FTP and TCP work in tandem. Just don't send rapid bursts of requests to either.
|
||||||
|
|
||||||
### Disable FTP
|
### Disable FTP
|
||||||
```
|
```
|
||||||
|
|||||||
+246
@@ -0,0 +1,246 @@
|
|||||||
|
# SLMM Roster Management
|
||||||
|
|
||||||
|
The SLMM standalone application now includes a roster management interface for viewing and configuring all Sound Level Meter devices.
|
||||||
|
|
||||||
|
## Features
|
||||||
|
|
||||||
|
### Web Interface
|
||||||
|
|
||||||
|
Access the roster at: **http://localhost:8100/roster**
|
||||||
|
|
||||||
|
The roster page provides:
|
||||||
|
|
||||||
|
- **Device List Table**: View all configured SLMs with their connection details
|
||||||
|
- **Real-time Status**: See device connectivity status (Online/Stale/Offline/Unknown)
|
||||||
|
- **Add Device**: Create new device configurations with a user-friendly modal form
|
||||||
|
- **Edit Device**: Modify existing device configurations
|
||||||
|
- **Delete Device**: Remove device configurations (does not affect physical devices)
|
||||||
|
- **Test Connection**: Run diagnostics on individual devices
|
||||||
|
|
||||||
|
### Table Columns
|
||||||
|
|
||||||
|
| Column | Description |
|
||||||
|
|--------|-------------|
|
||||||
|
| Unit ID | Unique identifier for the device |
|
||||||
|
| Host / IP | Device IP address or hostname |
|
||||||
|
| TCP Port | TCP control port (default: 2255) |
|
||||||
|
| FTP Port | FTP file transfer port (default: 21) |
|
||||||
|
| TCP | Whether TCP control is enabled |
|
||||||
|
| FTP | Whether FTP file transfer is enabled |
|
||||||
|
| Polling | Whether background polling is enabled |
|
||||||
|
| Status | Device connectivity status (Online/Stale/Offline/Unknown) |
|
||||||
|
| Actions | Test, Edit, Delete buttons |
|
||||||
|
|
||||||
|
### Status Indicators
|
||||||
|
|
||||||
|
- **Online** (green): Device responded within the last 5 minutes
|
||||||
|
- **Stale** (yellow): Device hasn't responded recently but was seen before
|
||||||
|
- **Offline** (red): Device is unreachable or has consecutive failures
|
||||||
|
- **Unknown** (gray): No status data available yet
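
As a rough illustration of how these four states could be derived from the `status` fields in the roster API response, here is a minimal Python sketch. The function name `classify_status`, the `ONLINE_WINDOW` constant, and the order of the checks (Offline is tested before Online) are assumptions made for illustration, not SLMM's actual implementation.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold, taken from the "last 5 minutes" wording above.
ONLINE_WINDOW = timedelta(minutes=5)


def classify_status(status: Optional[dict], now: datetime) -> str:
    """Map a roster `status` object to one of the four indicator labels.

    Hypothetical helper: the field names follow the roster API response,
    but the precedence of the checks is an assumption.
    """
    if not status or not status.get("last_seen"):
        return "Unknown"  # no status data available yet
    if not status.get("is_reachable", True) or status.get("consecutive_failures", 0) > 0:
        return "Offline"  # unreachable or has consecutive failures
    last_seen = datetime.fromisoformat(status["last_seen"])
    if now - last_seen <= ONLINE_WINDOW:
        return "Online"  # responded within the last 5 minutes
    return "Stale"  # seen before, but not recently
```

Passing `now` explicitly keeps the helper easy to test; a caller would pass a naive timestamp in the same clock as `last_seen`.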
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
### List All Devices
|
||||||
|
|
||||||
|
```bash
|
||||||
|
GET /api/nl43/roster
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns all configured devices with their status information.
|
||||||
|
|
||||||
|
**Response:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"status": "ok",
|
||||||
|
"devices": [
|
||||||
|
{
|
||||||
|
"unit_id": "SLM-43-01",
|
||||||
|
"host": "192.168.1.100",
|
||||||
|
"tcp_port": 2255,
|
||||||
|
"ftp_port": 21,
|
||||||
|
"tcp_enabled": true,
|
||||||
|
"ftp_enabled": true,
|
||||||
|
"ftp_username": "USER",
|
||||||
|
"ftp_password": "0000",
|
||||||
|
"web_enabled": false,
|
||||||
|
"poll_enabled": true,
|
||||||
|
"poll_interval_seconds": 60,
|
||||||
|
"status": {
|
||||||
|
"last_seen": "2026-01-16T20:00:00",
|
||||||
|
"measurement_state": "Start",
|
||||||
|
"is_reachable": true,
|
||||||
|
"consecutive_failures": 0,
|
||||||
|
"last_success": "2026-01-16T20:00:00",
|
||||||
|
"last_error": null
|
||||||
|
}
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"total": 1
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Create New Device
|
||||||
|
|
||||||
|
```bash
|
||||||
|
POST /api/nl43/roster
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
{
|
||||||
|
"unit_id": "SLM-43-01",
|
||||||
|
"host": "192.168.1.100",
|
||||||
|
"tcp_port": 2255,
|
||||||
|
"ftp_port": 21,
|
||||||
|
"tcp_enabled": true,
|
||||||
|
"ftp_enabled": false,
|
||||||
|
"poll_enabled": true,
|
||||||
|
"poll_interval_seconds": 60
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Required Fields:**
|
||||||
|
- `unit_id`: Unique device identifier
|
||||||
|
- `host`: IP address or hostname
|
||||||
|
|
||||||
|
**Optional Fields:**
|
||||||
|
- `tcp_port`: TCP control port (default: 2255)
|
||||||
|
- `ftp_port`: FTP port (default: 21)
|
||||||
|
- `tcp_enabled`: Enable TCP control (default: true)
|
||||||
|
- `ftp_enabled`: Enable FTP transfers (default: false)
|
||||||
|
- `ftp_username`: FTP username (only if ftp_enabled)
|
||||||
|
- `ftp_password`: FTP password (only if ftp_enabled)
|
||||||
|
- `poll_enabled`: Enable background polling (default: true)
|
||||||
|
- `poll_interval_seconds`: Polling interval 10-3600 seconds (default: 60)
|
||||||
|
|
||||||
|
**Response:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"status": "ok",
|
||||||
|
"message": "Device SLM-43-01 created successfully",
|
||||||
|
"data": {
|
||||||
|
"unit_id": "SLM-43-01",
|
||||||
|
"host": "192.168.1.100",
|
||||||
|
"tcp_port": 2255,
|
||||||
|
"tcp_enabled": true,
|
||||||
|
"ftp_enabled": false,
|
||||||
|
"poll_enabled": true,
|
||||||
|
"poll_interval_seconds": 60
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Update Device
|
||||||
|
|
||||||
|
```bash
|
||||||
|
PUT /api/nl43/{unit_id}/config
|
||||||
|
Content-Type: application/json
|
||||||
|
|
||||||
|
{
|
||||||
|
"host": "192.168.1.101",
|
||||||
|
"tcp_port": 2255,
|
||||||
|
"poll_interval_seconds": 120
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
All fields are optional. Only include fields you want to update.
|
||||||
|
|
||||||
|
### Delete Device
|
||||||
|
|
||||||
|
```bash
|
||||||
|
DELETE /api/nl43/{unit_id}/config
|
||||||
|
```
|
||||||
|
|
||||||
|
Removes the device configuration and associated status data. Does not affect the physical device.
|
||||||
|
|
||||||
|
**Response:**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"status": "ok",
|
||||||
|
"message": "Deleted device SLM-43-01"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Usage Examples
|
||||||
|
|
||||||
|
### Via Web Interface
|
||||||
|
|
||||||
|
1. Navigate to http://localhost:8100/roster
|
||||||
|
2. Click "Add Device" to create a new configuration
|
||||||
|
3. Fill in the device details (unit ID, IP address, ports)
|
||||||
|
4. Configure TCP, FTP, and polling settings
|
||||||
|
5. Click "Save Device"
|
||||||
|
6. Use "Test" button to verify connectivity
|
||||||
|
7. Edit or delete devices as needed
|
||||||
|
|
||||||
|
### Via API (curl)
|
||||||
|
|
||||||
|
**Add a new device:**
|
||||||
|
```bash
|
||||||
|
curl -X POST http://localhost:8100/api/nl43/roster \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"unit_id": "slm-site-a",
|
||||||
|
"host": "192.168.1.100",
|
||||||
|
"tcp_port": 2255,
|
||||||
|
"tcp_enabled": true,
|
||||||
|
"ftp_enabled": true,
|
||||||
|
"ftp_username": "USER",
|
||||||
|
"ftp_password": "0000",
|
||||||
|
"poll_enabled": true,
|
||||||
|
"poll_interval_seconds": 60
|
||||||
|
}'
|
||||||
|
```
|
||||||
|
|
||||||
|
**Update device host:**
|
||||||
|
```bash
|
||||||
|
curl -X PUT http://localhost:8100/api/nl43/slm-site-a/config \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"host": "192.168.1.101"}'
|
||||||
|
```
|
||||||
|
|
||||||
|
**Delete device:**
|
||||||
|
```bash
|
||||||
|
curl -X DELETE http://localhost:8100/api/nl43/slm-site-a/config
|
||||||
|
```
|
||||||
|
|
||||||
|
**List all devices:**
|
||||||
|
```bash
|
||||||
|
curl http://localhost:8100/api/nl43/roster | python3 -m json.tool
|
||||||
|
```
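
The same requests can be issued from Python. The sketch below uses only the standard library and builds the request objects without sending them. The helper name `roster_request` and the `BASE_URL` constant are illustrative assumptions; the endpoints and payload fields are the ones documented above.

```python
import json
import urllib.request
from typing import Optional

# Assumed base URL, matching the curl examples above.
BASE_URL = "http://localhost:8100"


def roster_request(method: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the SLMM API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Add a device (a subset of the curl payload), then change its host.
create = roster_request("POST", "/api/nl43/roster", {
    "unit_id": "slm-site-a",
    "host": "192.168.1.100",
    "tcp_port": 2255,
})
update = roster_request("PUT", "/api/nl43/slm-site-a/config", {"host": "192.168.1.101"})

# To actually send a request against a running SLMM instance:
# with urllib.request.urlopen(create) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it makes it possible to inspect the method, URL and body before touching a real device.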
|
||||||
|
|
||||||
|
## Integration with Terra-View
|
||||||
|
|
||||||
|
When SLMM is used as a module within Terra-View:
|
||||||
|
|
||||||
|
1. Terra-View manages device configurations in its own database
|
||||||
|
2. Terra-View syncs configurations to SLMM via `PUT /api/nl43/{unit_id}/config`
|
||||||
|
3. Terra-View can query device status via `GET /api/nl43/{unit_id}/status`
|
||||||
|
4. SLMM's roster page can be used for standalone testing and diagnostics
|
||||||
|
|
||||||
|
## Background Polling
|
||||||
|
|
||||||
|
Devices with `poll_enabled: true` are automatically polled at their configured interval:
|
||||||
|
|
||||||
|
- Polls device status every `poll_interval_seconds` (10-3600 seconds)
|
||||||
|
- Updates `NL43Status` table with latest measurements
|
||||||
|
- Tracks device reachability and failure counts
|
||||||
|
- Provides real-time status updates in the roster
|
||||||
|
|
||||||
|
**Note**: Polling respects the NL43 protocol's 1-second rate limit between commands.
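
One way to honour a minimum gap between commands is a small async limiter that sleeps until the gap has elapsed. This is a minimal sketch under stated assumptions, not SLMM's poller: the class name `CommandRateLimiter` is hypothetical, and the 1-second default comes from the note above.

```python
import asyncio


class CommandRateLimiter:
    """Enforce a minimum interval between successive device commands."""

    def __init__(self, min_interval_s: float = 1.0):
        self._min_interval_s = min_interval_s
        self._last_sent = None  # loop time of the previous command, if any
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        """Sleep just long enough that commands are at least min_interval_s apart."""
        async with self._lock:
            loop = asyncio.get_running_loop()
            if self._last_sent is not None:
                remaining = self._min_interval_s - (loop.time() - self._last_sent)
                if remaining > 0:
                    await asyncio.sleep(remaining)
            self._last_sent = loop.time()


async def demo() -> float:
    """Issue two back-to-back 'commands' and return the gap between them."""
    limiter = CommandRateLimiter(min_interval_s=0.05)  # short gap for the demo
    loop = asyncio.get_running_loop()
    await limiter.wait()
    first = loop.time()
    await limiter.wait()
    return loop.time() - first
```

A caller would `await limiter.wait()` immediately before each command sent to a given device. The lock serialises concurrent callers so that two tasks cannot both pass the check at once.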
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
The roster system validates:
|
||||||
|
|
||||||
|
- **Unit ID**: Must be unique across all devices
|
||||||
|
- **Host**: Valid IP address or hostname format
|
||||||
|
- **Ports**: Must be between 1 and 65535
|
||||||
|
- **Poll Interval**: Must be between 10 and 3600 seconds
|
||||||
|
- **Duplicate Check**: Returns 409 Conflict if unit_id already exists
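
The rules above could be checked roughly as follows. This is an illustrative sketch only: `validate_device` and its return shape are hypothetical, the host-format check is reduced to a presence check, and the real service is documented as answering a duplicate `unit_id` with 409 Conflict rather than an error string.

```python
from typing import Iterable, List


def validate_device(payload: dict, existing_ids: Iterable[str]) -> List[str]:
    """Return a list of human-readable problems; empty means the payload passes."""
    errors = []
    unit_id = payload.get("unit_id")
    if not unit_id:
        errors.append("unit_id is required")
    elif unit_id in set(existing_ids):
        errors.append("unit_id already exists (the API answers 409 Conflict)")
    if not payload.get("host"):
        errors.append("host is required")
    # Documented defaults: tcp_port 2255, ftp_port 21, poll interval 60 s.
    for field, default in (("tcp_port", 2255), ("ftp_port", 21)):
        port = payload.get(field, default)
        if not isinstance(port, int) or not 1 <= port <= 65535:
            errors.append(f"{field} must be between 1 and 65535")
    interval = payload.get("poll_interval_seconds", 60)
    if not isinstance(interval, int) or not 10 <= interval <= 3600:
        errors.append("poll_interval_seconds must be between 10 and 3600")
    return errors
```

Collecting every problem in one pass, instead of stopping at the first, lets a form or API client report all invalid fields together.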
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- Deleting a device from the roster does NOT affect the physical device
|
||||||
|
- Device configurations are stored in the SLMM database (`data/slmm.db`)
|
||||||
|
- Status information is updated by the background polling system
|
||||||
|
- The roster page auto-refreshes status indicators
|
||||||
|
- Test button runs full diagnostics (connectivity, TCP, FTP if enabled)
|
||||||
@@ -0,0 +1,26 @@
|
|||||||
|
# SLMM Feature Documentation
|
||||||
|
|
||||||
|
This directory contains detailed documentation for specific SLMM features and enhancements.
|
||||||
|
|
||||||
|
## Feature Documents
|
||||||
|
|
||||||
|
### FEATURE_SUMMARY.md
|
||||||
|
Overview of all major features in SLMM.
|
||||||
|
|
||||||
|
### SETTINGS_ENDPOINT.md
|
||||||
|
Documentation of the device settings endpoint and verification system.
|
||||||
|
|
||||||
|
### TIMEZONE_CONFIGURATION.md
|
||||||
|
Timezone handling and configuration for SLMM timestamps.
|
||||||
|
|
||||||
|
### SLEEP_MODE_AUTO_DISABLE.md
|
||||||
|
Automatic sleep mode wake-up system for background polling.
|
||||||
|
|
||||||
|
### UI_UPDATE.md
|
||||||
|
UI/UX improvements and interface updates.
|
||||||
|
|
||||||
|
## Related Documentation
|
||||||
|
|
||||||
|
- [../README.md](../../README.md) - Main SLMM documentation
|
||||||
|
- [../CHANGELOG.md](../../CHANGELOG.md) - Version history
|
||||||
|
- [../API.md](../../API.md) - Complete API reference
|
||||||
@@ -0,0 +1,73 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Database migration: Add device_logs table.
|
||||||
|
|
||||||
|
This table stores per-device log entries for debugging and audit trail.
|
||||||
|
|
||||||
|
Run this once to add the new table.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
import os
|
||||||
|
|
||||||
|
# Path to the SLMM database
|
||||||
|
DB_PATH = os.path.join(os.path.dirname(__file__), "data", "slmm.db")
|
||||||
|
|
||||||
|
|
||||||
|
def migrate():
|
||||||
|
print(f"Adding device_logs table to: {DB_PATH}")
|
||||||
|
|
||||||
|
if not os.path.exists(DB_PATH):
|
||||||
|
print("Database does not exist yet. Table will be created automatically on first run.")
|
||||||
|
return
|
||||||
|
|
||||||
|
conn = sqlite3.connect(DB_PATH)
|
||||||
|
cursor = conn.cursor()
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Check if table already exists
|
||||||
|
cursor.execute("""
|
||||||
|
SELECT name FROM sqlite_master
|
||||||
|
WHERE type='table' AND name='device_logs'
|
||||||
|
""")
|
||||||
|
if cursor.fetchone():
|
||||||
|
print("✓ device_logs table already exists, no migration needed")
|
||||||
|
return
|
||||||
|
|
||||||
|
# Create the table
|
||||||
|
print("Creating device_logs table...")
|
||||||
|
cursor.execute("""
|
||||||
|
CREATE TABLE device_logs (
|
||||||
|
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||||
|
unit_id VARCHAR NOT NULL,
|
||||||
|
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||||
|
level VARCHAR DEFAULT 'INFO',
|
||||||
|
category VARCHAR DEFAULT 'GENERAL',
|
||||||
|
message TEXT NOT NULL
|
||||||
|
)
|
||||||
|
""")
|
||||||
|
|
||||||
|
# Create indexes for efficient querying
|
||||||
|
print("Creating indexes...")
|
||||||
|
cursor.execute("CREATE INDEX ix_device_logs_unit_id ON device_logs (unit_id)")
|
||||||
|
cursor.execute("CREATE INDEX ix_device_logs_timestamp ON device_logs (timestamp)")
|
||||||
|
|
||||||
|
conn.commit()
|
||||||
|
print("✓ Created device_logs table with indexes")
|
||||||
|
|
||||||
|
# Verify
|
||||||
|
cursor.execute("""
|
||||||
|
SELECT name FROM sqlite_master
|
||||||
|
WHERE type='table' AND name='device_logs'
|
||||||
|
""")
|
||||||
|
if not cursor.fetchone():
|
||||||
|
raise Exception("device_logs table was not created successfully")
|
||||||
|
|
||||||
|
print("✓ Migration completed successfully")
|
||||||
|
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
migrate()
|
||||||
@@ -0,0 +1,58 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Migration script to add ln1 and ln2 percentile columns to the nl43_status table.
|
||||||
|
|
||||||
|
The NL-43 DOD response carries percentile slots LN1-LN5; the live SLM display
|
||||||
|
(Terra-View) shows two of them (default L1/L10). This adds storage for the two
|
||||||
|
surfaced slots. Run once per database to update existing schema.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
DB_PATH = Path(__file__).parent / "data" / "slmm.db"
|
||||||
|
|
||||||
|
|
||||||
|
def migrate():
|
||||||
|
"""Add ln1 and ln2 columns to the nl43_status table."""
|
||||||
|
|
||||||
|
if not DB_PATH.exists():
|
||||||
|
print(f"Database not found at {DB_PATH}")
|
||||||
|
print("No migration needed - database will be created with new schema")
|
||||||
|
return
|
||||||
|
|
||||||
|
conn = sqlite3.connect(DB_PATH)
|
||||||
|
cursor = conn.cursor()
|
||||||
|
|
||||||
|
try:
|
||||||
|
cursor.execute("PRAGMA table_info(nl43_status)")
|
||||||
|
columns = [row[1] for row in cursor.fetchall()]
|
||||||
|
|
||||||
|
if "ln1" in columns and "ln2" in columns:
|
||||||
|
print("✓ ln1/ln2 columns already exist, no migration needed")
|
||||||
|
return
|
||||||
|
|
||||||
|
if "ln1" not in columns:
|
||||||
|
print("Adding ln1 column...")
|
||||||
|
cursor.execute("ALTER TABLE nl43_status ADD COLUMN ln1 TEXT")
|
||||||
|
print("✓ Added ln1 column")
|
||||||
|
|
||||||
|
if "ln2" not in columns:
|
||||||
|
print("Adding ln2 column...")
|
||||||
|
cursor.execute("ALTER TABLE nl43_status ADD COLUMN ln2 TEXT")
|
||||||
|
print("✓ Added ln2 column")
|
||||||
|
|
||||||
|
conn.commit()
|
||||||
|
print("\n✓ Migration completed successfully!")
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
conn.rollback()
|
||||||
|
print(f"✗ Migration failed: {e}", file=sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
migrate()
|
||||||
@@ -0,0 +1,48 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Migration: add monitor_enabled column to nl43_config.
|
||||||
|
|
||||||
|
Controls whether the live fan-out DOD monitor is kept alive 24/7 for a unit
|
||||||
|
(which is what makes alerting continuous). Defaults to enabled. Run once per DB.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
DB_PATH = Path(__file__).parent / "data" / "slmm.db"
|
||||||
|
|
||||||
|
|
||||||
|
def migrate():
|
||||||
|
if not DB_PATH.exists():
|
||||||
|
print(f"Database not found at {DB_PATH}")
|
||||||
|
print("No migration needed - database will be created with new schema")
|
||||||
|
return
|
||||||
|
|
||||||
|
conn = sqlite3.connect(DB_PATH)
|
||||||
|
cursor = conn.cursor()
|
||||||
|
try:
|
||||||
|
cursor.execute("PRAGMA table_info(nl43_config)")
|
||||||
|
columns = [row[1] for row in cursor.fetchall()]
|
||||||
|
|
||||||
|
if "monitor_enabled" in columns:
|
||||||
|
print("✓ monitor_enabled column already exists, no migration needed")
|
||||||
|
return
|
||||||
|
|
||||||
|
print("Adding monitor_enabled column (default enabled)...")
|
||||||
|
# SQLite stores booleans as 0/1; default 1 = enabled.
|
||||||
|
cursor.execute("ALTER TABLE nl43_config ADD COLUMN monitor_enabled BOOLEAN DEFAULT 1")
|
||||||
|
conn.commit()
|
||||||
|
print("✓ Added monitor_enabled column")
|
||||||
|
print("\n✓ Migration completed successfully!")
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
conn.rollback()
|
||||||
|
print(f"✗ Migration failed: {e}", file=sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
migrate()
|
||||||
@@ -0,0 +1,136 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Migration script to add polling-related fields to nl43_config and nl43_status tables.
|
||||||
|
|
||||||
|
Adds to nl43_config:
|
||||||
|
- poll_interval_seconds (INTEGER, default 60)
|
||||||
|
- poll_enabled (BOOLEAN, default 1/True)
|
||||||
|
|
||||||
|
Adds to nl43_status:
|
||||||
|
- is_reachable (BOOLEAN, default 1/True)
|
||||||
|
- consecutive_failures (INTEGER, default 0)
|
||||||
|
- last_poll_attempt (DATETIME, nullable)
|
||||||
|
- last_success (DATETIME, nullable)
|
||||||
|
- last_error (TEXT, nullable)
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python migrate_add_polling_fields.py
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
|
def migrate():
|
||||||
|
db_path = Path("data/slmm.db")
|
||||||
|
|
||||||
|
if not db_path.exists():
|
||||||
|
print(f"❌ Database not found at {db_path}")
|
||||||
|
print(" Run this script from the slmm directory")
|
||||||
|
return False
|
||||||
|
|
||||||
|
try:
|
||||||
|
conn = sqlite3.connect(db_path)
|
||||||
|
cursor = conn.cursor()
|
||||||
|
|
||||||
|
# Check nl43_config columns
|
||||||
|
cursor.execute("PRAGMA table_info(nl43_config)")
|
||||||
|
config_columns = [row[1] for row in cursor.fetchall()]
|
||||||
|
|
||||||
|
# Check nl43_status columns
|
||||||
|
cursor.execute("PRAGMA table_info(nl43_status)")
|
||||||
|
status_columns = [row[1] for row in cursor.fetchall()]
|
||||||
|
|
||||||
|
changes_made = False
|
||||||
|
|
||||||
|
# Add nl43_config columns
|
||||||
|
if "poll_interval_seconds" not in config_columns:
|
||||||
|
print("Adding poll_interval_seconds to nl43_config...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_config
|
||||||
|
ADD COLUMN poll_interval_seconds INTEGER DEFAULT 60
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ poll_interval_seconds already exists in nl43_config")
|
||||||
|
|
||||||
|
if "poll_enabled" not in config_columns:
|
||||||
|
print("Adding poll_enabled to nl43_config...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_config
|
||||||
|
ADD COLUMN poll_enabled BOOLEAN DEFAULT 1
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ poll_enabled already exists in nl43_config")
|
||||||
|
|
||||||
|
# Add nl43_status columns
|
||||||
|
if "is_reachable" not in status_columns:
|
||||||
|
print("Adding is_reachable to nl43_status...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_status
|
||||||
|
ADD COLUMN is_reachable BOOLEAN DEFAULT 1
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ is_reachable already exists in nl43_status")
|
||||||
|
|
||||||
|
if "consecutive_failures" not in status_columns:
|
||||||
|
print("Adding consecutive_failures to nl43_status...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_status
|
||||||
|
ADD COLUMN consecutive_failures INTEGER DEFAULT 0
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ consecutive_failures already exists in nl43_status")
|
||||||
|
|
||||||
|
if "last_poll_attempt" not in status_columns:
|
||||||
|
print("Adding last_poll_attempt to nl43_status...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_status
|
||||||
|
ADD COLUMN last_poll_attempt DATETIME
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ last_poll_attempt already exists in nl43_status")
|
||||||
|
|
||||||
|
if "last_success" not in status_columns:
|
||||||
|
print("Adding last_success to nl43_status...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_status
|
||||||
|
ADD COLUMN last_success DATETIME
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ last_success already exists in nl43_status")
|
||||||
|
|
||||||
|
if "last_error" not in status_columns:
|
||||||
|
print("Adding last_error to nl43_status...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_status
|
||||||
|
ADD COLUMN last_error TEXT
|
||||||
|
""")
|
||||||
|
changes_made = True
|
||||||
|
else:
|
||||||
|
print("✓ last_error already exists in nl43_status")
|
||||||
|
|
||||||
|
if changes_made:
|
||||||
|
conn.commit()
|
||||||
|
print("\n✓ Migration completed successfully")
|
||||||
|
print(" Added polling-related fields to nl43_config and nl43_status")
|
||||||
|
else:
|
||||||
|
print("\n✓ All polling fields already exist - no changes needed")
|
||||||
|
|
||||||
|
conn.close()
|
||||||
|
return True
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
print(f"❌ Migration failed: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
success = migrate()
|
||||||
|
sys.exit(0 if success else 1)
|
||||||
@@ -0,0 +1,60 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Database migration: Add start_time_sync_attempted field to nl43_status table.
|
||||||
|
|
||||||
|
This field tracks whether FTP sync has been attempted for the current measurement,
|
||||||
|
preventing repeated sync attempts when FTP fails.
|
||||||
|
|
||||||
|
Run this once to add the new column.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sqlite3
|
||||||
|
import os
|
||||||
|
|
||||||
|
# Path to the SLMM database
|
||||||
|
DB_PATH = os.path.join(os.path.dirname(__file__), "data", "slmm.db")
|
||||||
|
|
||||||
|
|
||||||
|
def migrate():
|
||||||
|
print(f"Adding start_time_sync_attempted field to: {DB_PATH}")
|
||||||
|
|
||||||
|
if not os.path.exists(DB_PATH):
|
||||||
|
print("Database does not exist yet. Column will be created automatically.")
|
||||||
|
return
|
||||||
|
|
||||||
|
conn = sqlite3.connect(DB_PATH)
|
||||||
|
cursor = conn.cursor()
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Check if column already exists
|
||||||
|
cursor.execute("PRAGMA table_info(nl43_status)")
|
||||||
|
columns = [col[1] for col in cursor.fetchall()]
|
||||||
|
|
||||||
|
if 'start_time_sync_attempted' in columns:
|
||||||
|
print("✓ start_time_sync_attempted column already exists, no migration needed")
|
||||||
|
return
|
||||||
|
|
||||||
|
# Add the column
|
||||||
|
print("Adding start_time_sync_attempted column...")
|
||||||
|
cursor.execute("""
|
||||||
|
ALTER TABLE nl43_status
|
||||||
|
ADD COLUMN start_time_sync_attempted BOOLEAN DEFAULT 0
|
||||||
|
""")
|
||||||
|
conn.commit()
|
||||||
|
print("✓ Added start_time_sync_attempted column")
|
||||||
|
|
||||||
|
# Verify
|
||||||
|
cursor.execute("PRAGMA table_info(nl43_status)")
|
||||||
|
columns = [col[1] for col in cursor.fetchall()]
|
||||||
|
|
||||||
|
if 'start_time_sync_attempted' not in columns:
|
||||||
|
raise Exception("start_time_sync_attempted column was not added successfully")
|
||||||
|
|
||||||
|
print("✓ Migration completed successfully")
|
||||||
|
|
||||||
|
finally:
|
||||||
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
migrate()
|
||||||
+255
-10
@@ -31,6 +31,11 @@
|
|||||||
<body>
|
<body>
|
||||||
<h1>SLMM NL43 Standalone</h1>
|
<h1>SLMM NL43 Standalone</h1>
|
||||||
<p>Configure a unit (host/port), then use controls to Start/Stop and fetch live status.</p>
|
<p>Configure a unit (host/port), then use controls to Start/Stop and fetch live status.</p>
|
||||||
|
<p style="margin-bottom: 16px;">
|
||||||
|
<a href="/roster" style="color: #0969da; text-decoration: none; font-weight: 600;">📊 View Device Roster</a>
|
||||||
|
<span style="margin: 0 8px; color: #d0d7de;">|</span>
|
||||||
|
<a href="/docs" style="color: #0969da; text-decoration: none;">API Documentation</a>
|
||||||
|
</p>
|
||||||
|
|
||||||
<fieldset>
|
<fieldset>
|
||||||
<legend>🔍 Connection Diagnostics</legend>
|
<legend>🔍 Connection Diagnostics</legend>
|
||||||
@@ -40,13 +45,34 @@
|
|||||||
</fieldset>
|
</fieldset>
|
||||||
|
|
||||||
<fieldset>
|
<fieldset>
|
||||||
<legend>Unit Config</legend>
|
<legend>Unit Selection & Config</legend>
|
||||||
<label>Unit ID</label>
|
|
||||||
<input id="unitId" value="nl43-1" />
|
<div style="display: flex; gap: 8px; align-items: flex-end; margin-bottom: 12px;">
|
||||||
<label>Host</label>
|
<div style="flex: 1;">
|
||||||
<input id="host" value="127.0.0.1" />
|
<label>Select Device</label>
|
||||||
<label>Port</label>
|
<select id="deviceSelector" onchange="loadSelectedDevice()" style="width: 100%; padding: 8px; margin-bottom: 0;">
|
||||||
<input id="port" type="number" value="80" />
|
<option value="">-- Select a device --</option>
|
||||||
|
</select>
|
||||||
|
</div>
|
||||||
|
<button onclick="refreshDeviceList()" style="padding: 8px 12px;">↻ Refresh</button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div style="padding: 12px; background: #f6f8fa; border: 1px solid #d0d7de; border-radius: 4px; margin-bottom: 12px;">
|
||||||
|
<div style="display: flex; gap: 16px;">
|
||||||
|
<div style="flex: 1;">
|
||||||
|
<label>Unit ID</label>
|
||||||
|
<input id="unitId" value="nl43-1" />
|
||||||
|
</div>
|
||||||
|
<div style="flex: 2;">
|
||||||
|
<label>Host</label>
|
||||||
|
<input id="host" value="127.0.0.1" />
|
||||||
|
</div>
|
||||||
|
<div style="flex: 1;">
|
||||||
|
<label>TCP Port</label>
|
||||||
|
<input id="port" type="number" value="2255" />
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
<div style="margin: 12px 0;">
|
<div style="margin: 12px 0;">
|
||||||
<label style="display: inline-flex; align-items: center; margin-right: 16px;">
|
<label style="display: inline-flex; align-items: center; margin-right: 16px;">
|
||||||
@@ -66,8 +92,10 @@
|
|||||||
<input id="ftpPassword" type="password" value="0000" />
|
<input id="ftpPassword" type="password" value="0000" />
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<button onclick="saveConfig()" style="margin-top: 12px;">Save Config</button>
|
<div style="margin-top: 12px;">
|
||||||
<button onclick="loadConfig()">Load Config</button>
|
<button onclick="saveConfig()">Save Config</button>
|
||||||
|
<button onclick="loadConfig()">Load Config</button>
|
||||||
|
</div>
|
||||||
</fieldset>
|
</fieldset>
|
||||||
|
|
||||||
<fieldset>
|
<fieldset>
|
||||||
@@ -148,6 +176,7 @@
|
|||||||
|
|
||||||
let ws = null;
|
let ws = null;
|
||||||
let streamUpdateCount = 0;
|
let streamUpdateCount = 0;
|
||||||
|
let availableDevices = [];
|
||||||
|
|
||||||
function log(msg) {
|
function log(msg) {
|
||||||
logEl.textContent += msg + "\n";
|
logEl.textContent += msg + "\n";
|
||||||
@@ -160,9 +189,97 @@
|
|||||||
ftpCredentials.style.display = ftpEnabled ? 'block' : 'none';
|
ftpCredentials.style.display = ftpEnabled ? 'block' : 'none';
|
||||||
}
|
}
|
||||||
|
|
||||||
// Add event listener for FTP checkbox
|
// Load device list from roster
|
||||||
|
async function refreshDeviceList() {
|
||||||
|
try {
|
||||||
|
const res = await fetch('/api/nl43/roster');
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
log('Failed to load device list');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
availableDevices = data.devices || [];
|
||||||
|
const selector = document.getElementById('deviceSelector');
|
||||||
|
|
||||||
|
// Save current selection
|
||||||
|
const currentSelection = selector.value;
|
||||||
|
|
||||||
|
// Clear and rebuild options
|
||||||
|
selector.innerHTML = '<option value="">-- Select a device --</option>';
|
||||||
|
|
||||||
|
availableDevices.forEach(device => {
|
||||||
|
const option = document.createElement('option');
|
||||||
|
option.value = device.unit_id;
|
||||||
|
|
||||||
|
// Add status indicator
|
||||||
|
let statusIcon = '⚪';
|
||||||
|
if (device.status) {
|
||||||
|
if (device.status.is_reachable === false) {
|
||||||
|
statusIcon = '🔴';
|
||||||
|
} else if (device.status.last_success) {
|
||||||
|
const lastSeen = new Date(device.status.last_success);
|
||||||
|
const ageMinutes = Math.floor((Date.now() - lastSeen) / 60000);
|
||||||
|
statusIcon = ageMinutes < 5 ? '🟢' : '🟡';
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
option.textContent = `${statusIcon} ${device.unit_id} (${device.host})`;
|
||||||
|
selector.appendChild(option);
|
||||||
|
});
|
||||||
|
|
||||||
|
// Restore selection if it still exists
|
||||||
|
if (currentSelection && availableDevices.find(d => d.unit_id === currentSelection)) {
|
||||||
|
selector.value = currentSelection;
|
||||||
|
}
|
||||||
|
|
||||||
|
log(`Loaded ${availableDevices.length} device(s) from roster`);
|
||||||
|
} catch (err) {
|
||||||
|
log(`Error loading device list: ${err.message}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Load selected device configuration
|
||||||
|
function loadSelectedDevice() {
|
||||||
|
const selector = document.getElementById('deviceSelector');
|
||||||
|
const unitId = selector.value;
|
||||||
|
|
||||||
|
if (!unitId) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
const device = availableDevices.find(d => d.unit_id === unitId);
|
||||||
|
if (!device) {
|
||||||
|
log(`Device ${unitId} not found in list`);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Populate form fields
|
||||||
|
document.getElementById('unitId').value = device.unit_id;
|
||||||
|
document.getElementById('host').value = device.host;
|
||||||
|
document.getElementById('port').value = device.tcp_port || 2255;
|
||||||
|
document.getElementById('tcpEnabled').checked = device.tcp_enabled || false;
|
||||||
|
document.getElementById('ftpEnabled').checked = device.ftp_enabled || false;
|
||||||
|
|
||||||
|
if (device.ftp_username) {
|
||||||
|
document.getElementById('ftpUsername').value = device.ftp_username;
|
||||||
|
}
|
||||||
|
if (device.ftp_password) {
|
||||||
|
document.getElementById('ftpPassword').value = device.ftp_password;
|
||||||
|
}
|
||||||
|
|
||||||
|
toggleFtpCredentials();
|
||||||
|
|
||||||
|
log(`Loaded configuration for ${device.unit_id}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Add event listeners
|
||||||
document.addEventListener('DOMContentLoaded', function() {
|
document.addEventListener('DOMContentLoaded', function() {
|
||||||
document.getElementById('ftpEnabled').addEventListener('change', toggleFtpCredentials);
|
document.getElementById('ftpEnabled').addEventListener('change', toggleFtpCredentials);
|
||||||
|
|
||||||
|
// Load device list on page load
|
||||||
|
refreshDeviceList();
|
||||||
});
|
});
|
||||||
|
|
||||||
async function runDiagnostics() {
|
async function runDiagnostics() {
|
||||||
@@ -216,6 +333,134 @@
|
|||||||
|
|
||||||
html += `<p style="margin-top: 12px; font-size: 0.9em; color: #666;">Last run: ${new Date(data.timestamp).toLocaleString()}</p>`;
|
html += `<p style="margin-top: 12px; font-size: 0.9em; color: #666;">Last run: ${new Date(data.timestamp).toLocaleString()}</p>`;
|
||||||
|
|
||||||
|
// Add database dump section if available
|
||||||
|
if (data.database_dump) {
|
||||||
|
html += `<div style="margin-top: 16px; border-top: 1px solid #d0d7de; padding-top: 12px;">`;
|
||||||
|
html += `<h4 style="margin: 0 0 12px 0;">📦 Database Dump</h4>`;
|
||||||
|
|
||||||
|
// Config section
|
||||||
|
if (data.database_dump.config) {
|
||||||
|
const cfg = data.database_dump.config;
|
||||||
|
html += `<div style="background: #f0f4f8; padding: 12px; border-radius: 4px; margin-bottom: 12px;">`;
|
||||||
|
html += `<strong>Configuration (nl43_config)</strong>`;
|
||||||
|
html += `<table style="width: 100%; margin-top: 8px; font-size: 0.9em;">`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Host</td><td>${cfg.host}:${cfg.tcp_port}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">TCP Enabled</td><td>${cfg.tcp_enabled ? '✓' : '✗'}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">FTP Enabled</td><td>${cfg.ftp_enabled ? '✓' : '✗'}${cfg.ftp_enabled ? ` (port ${cfg.ftp_port}, user: ${cfg.ftp_username || 'none'})` : ''}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Background Polling</td><td>${cfg.poll_enabled ? `✓ every ${cfg.poll_interval_seconds}s` : '✗ disabled'}</td></tr>`;
|
||||||
|
html += `</table></div>`;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Status cache section
|
||||||
|
if (data.database_dump.status_cache) {
|
||||||
|
const cache = data.database_dump.status_cache;
|
||||||
|
html += `<div style="background: #f0f8f4; padding: 12px; border-radius: 4px; margin-bottom: 12px;">`;
|
||||||
|
html += `<strong>Status Cache (nl43_status)</strong>`;
|
||||||
|
html += `<table style="width: 100%; margin-top: 8px; font-size: 0.9em;">`;
|
||||||
|
|
||||||
|
// Measurement state and timing
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Measurement State</td><td><strong>${cache.measurement_state || 'unknown'}</strong></td></tr>`;
|
||||||
|
if (cache.measurement_start_time) {
|
||||||
|
const startTime = new Date(cache.measurement_start_time);
|
||||||
|
const elapsed = Math.floor((Date.now() - startTime) / 1000);
|
||||||
|
const elapsedStr = elapsed > 3600 ? `${Math.floor(elapsed/3600)}h ${Math.floor((elapsed%3600)/60)}m` : elapsed > 60 ? `${Math.floor(elapsed/60)}m ${elapsed%60}s` : `${elapsed}s`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Measurement Started</td><td>${startTime.toLocaleString()} (${elapsedStr} ago)</td></tr>`;
|
||||||
|
}
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Counter (d0)</td><td>${cache.counter || 'N/A'}</td></tr>`;
|
||||||
|
|
||||||
|
// Sound levels
|
||||||
|
html += `<tr><td colspan="2" style="padding: 8px 8px 2px 8px; font-weight: 600; border-top: 1px solid #d0d7de;">Sound Levels (dB)</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Lp (Instantaneous)</td><td>${cache.lp || 'N/A'}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Leq (Equivalent)</td><td>${cache.leq || 'N/A'}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Lmax / Lmin</td><td>${cache.lmax || 'N/A'} / ${cache.lmin || 'N/A'}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Lpeak</td><td>${cache.lpeak || 'N/A'}</td></tr>`;
|
||||||
|
|
||||||
|
// Device status
|
||||||
|
html += `<tr><td colspan="2" style="padding: 8px 8px 2px 8px; font-weight: 600; border-top: 1px solid #d0d7de;">Device Status</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Battery</td><td>${cache.battery_level || 'N/A'}${cache.power_source ? ` (${cache.power_source})` : ''}</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">SD Card</td><td>${cache.sd_remaining_mb ? `${cache.sd_remaining_mb} MB` : 'N/A'}${cache.sd_free_ratio ? ` (${cache.sd_free_ratio} free)` : ''}</td></tr>`;
|
||||||
|
|
||||||
|
// Polling status
|
||||||
|
html += `<tr><td colspan="2" style="padding: 8px 8px 2px 8px; font-weight: 600; border-top: 1px solid #d0d7de;">Polling Status</td></tr>`;
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Reachable</td><td>${cache.is_reachable ? '🟢 Yes' : '🔴 No'}</td></tr>`;
|
||||||
|
if (cache.last_seen) {
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Last Seen</td><td>${new Date(cache.last_seen).toLocaleString()}</td></tr>`;
|
||||||
|
}
|
||||||
|
if (cache.last_success) {
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Last Success</td><td>${new Date(cache.last_success).toLocaleString()}</td></tr>`;
|
||||||
|
}
|
||||||
|
if (cache.last_poll_attempt) {
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Last Poll Attempt</td><td>${new Date(cache.last_poll_attempt).toLocaleString()}</td></tr>`;
|
||||||
|
}
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Consecutive Failures</td><td>${cache.consecutive_failures || 0}</td></tr>`;
|
||||||
|
if (cache.last_error) {
|
||||||
|
html += `<tr><td style="padding: 2px 8px; color: #666;">Last Error</td><td style="color: #d00; font-size: 0.85em;">${cache.last_error}</td></tr>`;
|
||||||
|
}
|
||||||
|
|
||||||
|
html += `</table></div>`;
|
||||||
|
|
||||||
|
// Raw payload (collapsible)
|
||||||
|
if (cache.raw_payload) {
|
||||||
|
html += `<details style="margin-top: 8px;"><summary style="cursor: pointer; color: #666; font-size: 0.9em;">📄 Raw Payload</summary>`;
|
||||||
|
html += `<pre style="background: #f6f8fa; padding: 8px; border-radius: 4px; font-size: 0.8em; overflow-x: auto; margin-top: 8px;">${cache.raw_payload}</pre></details>`;
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
html += `<p style="color: #888; font-style: italic;">No cached status available for this unit.</p>`;
|
||||||
|
}
|
||||||
|
|
||||||
|
html += `</div>`;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Fetch and display device logs
|
||||||
|
try {
|
||||||
|
const logsRes = await fetch(`/api/nl43/${unitId}/logs?limit=50`);
|
||||||
|
if (logsRes.ok) {
|
||||||
|
const logsData = await logsRes.json();
|
||||||
|
if (logsData.logs && logsData.logs.length > 0) {
|
||||||
|
html += `<div style="margin-top: 16px; border-top: 1px solid #d0d7de; padding-top: 12px;">`;
|
||||||
|
html += `<h4 style="margin: 0 0 12px 0;">📋 Device Logs (${logsData.stats.total} total)</h4>`;
|
||||||
|
|
||||||
|
// Stats summary
|
||||||
|
if (logsData.stats.by_level) {
|
||||||
|
html += `<div style="margin-bottom: 8px; font-size: 0.85em; color: #666;">`;
|
||||||
|
const levels = logsData.stats.by_level;
|
||||||
|
const parts = [];
|
||||||
|
if (levels.ERROR) parts.push(`<span style="color: #d00;">${levels.ERROR} errors</span>`);
|
||||||
|
if (levels.WARNING) parts.push(`<span style="color: #fa0;">${levels.WARNING} warnings</span>`);
|
||||||
|
if (levels.INFO) parts.push(`${levels.INFO} info`);
|
||||||
|
html += parts.join(' · ');
|
||||||
|
html += `</div>`;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Log entries (collapsible)
|
||||||
|
html += `<details open><summary style="cursor: pointer; font-size: 0.9em; margin-bottom: 8px;">Recent entries (${logsData.logs.length})</summary>`;
|
||||||
|
html += `<div style="max-height: 300px; overflow-y: auto; background: #f6f8fa; border: 1px solid #d0d7de; border-radius: 4px; padding: 8px; font-size: 0.8em; font-family: monospace;">`;
|
||||||
|
|
||||||
|
logsData.logs.forEach(entry => {
|
||||||
|
const levelColor = {
|
||||||
|
'ERROR': '#d00',
|
||||||
|
'WARNING': '#b86e00',
|
||||||
|
'INFO': '#0969da',
|
||||||
|
'DEBUG': '#888'
|
||||||
|
}[entry.level] || '#666';
|
||||||
|
|
||||||
|
const time = new Date(entry.timestamp).toLocaleString();
|
||||||
|
html += `<div style="margin-bottom: 4px; border-bottom: 1px solid #eee; padding-bottom: 4px;">`;
|
||||||
|
html += `<span style="color: #888;">${time}</span> `;
|
||||||
|
html += `<span style="color: ${levelColor}; font-weight: 600;">[${entry.level}]</span> `;
|
||||||
|
html += `<span style="color: #666;">[${entry.category}]</span> `;
|
||||||
|
html += `${entry.message}`;
|
||||||
|
html += `</div>`;
|
||||||
|
});
|
||||||
|
|
||||||
|
html += `</div></details>`;
|
||||||
|
html += `</div>`;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch (logErr) {
|
||||||
|
console.log('Could not fetch device logs:', logErr);
|
||||||
|
}
|
||||||
|
|
||||||
resultsEl.innerHTML = html;
|
resultsEl.innerHTML = html;
|
||||||
log(`Diagnostics complete: ${data.overall_status}`);
|
log(`Diagnostics complete: ${data.overall_status}`);
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,901 @@
|
|||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="en">
|
||||||
|
<head>
|
||||||
|
<meta charset="UTF-8" />
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||||
|
<title>SLMM - Device Roster &amp; Connections</title>
|
||||||
|
<style>
|
||||||
|
* { box-sizing: border-box; }
|
||||||
|
body {
|
||||||
|
font-family: system-ui, -apple-system, sans-serif;
|
||||||
|
margin: 0;
|
||||||
|
padding: 24px;
|
||||||
|
background: #f6f8fa;
|
||||||
|
}
|
||||||
|
.container { max-width: 1400px; margin: 0 auto; }
|
||||||
|
.header {
|
||||||
|
display: flex;
|
||||||
|
justify-content: space-between;
|
||||||
|
align-items: center;
|
||||||
|
margin-bottom: 24px;
|
||||||
|
padding: 16px;
|
||||||
|
background: white;
|
||||||
|
border-radius: 6px;
|
||||||
|
box-shadow: 0 1px 3px rgba(0,0,0,0.1);
|
||||||
|
}
|
||||||
|
h1 { margin: 0; font-size: 24px; }
|
||||||
|
.nav { display: flex; gap: 12px; }
|
||||||
|
.btn {
|
||||||
|
padding: 8px 16px;
|
||||||
|
border: 1px solid #d0d7de;
|
||||||
|
background: white;
|
||||||
|
border-radius: 6px;
|
||||||
|
cursor: pointer;
|
||||||
|
text-decoration: none;
|
||||||
|
color: #24292f;
|
||||||
|
font-size: 14px;
|
||||||
|
transition: background 0.2s;
|
||||||
|
}
|
||||||
|
.btn:hover { background: #f6f8fa; }
|
||||||
|
.btn-primary {
|
||||||
|
background: #2da44e;
|
||||||
|
color: white;
|
||||||
|
border-color: #2da44e;
|
||||||
|
}
|
||||||
|
.btn-primary:hover { background: #2c974b; }
|
||||||
|
.btn-danger {
|
||||||
|
background: #cf222e;
|
||||||
|
color: white;
|
||||||
|
border-color: #cf222e;
|
||||||
|
}
|
||||||
|
.btn-danger:hover { background: #a40e26; }
|
||||||
|
.btn-small {
|
||||||
|
padding: 4px 8px;
|
||||||
|
font-size: 12px;
|
||||||
|
margin-right: 4px;
|
||||||
|
}
|
||||||
|
.table-container {
|
||||||
|
background: white;
|
||||||
|
border-radius: 6px;
|
||||||
|
box-shadow: 0 1px 3px rgba(0,0,0,0.1);
|
||||||
|
overflow-x: auto;
|
||||||
|
}
|
||||||
|
table {
|
||||||
|
width: 100%;
|
||||||
|
border-collapse: collapse;
|
||||||
|
}
|
||||||
|
th {
|
||||||
|
background: #f6f8fa;
|
||||||
|
padding: 12px;
|
||||||
|
text-align: left;
|
||||||
|
font-weight: 600;
|
||||||
|
border-bottom: 2px solid #d0d7de;
|
||||||
|
font-size: 13px;
|
||||||
|
white-space: nowrap;
|
||||||
|
}
|
||||||
|
td {
|
||||||
|
padding: 12px;
|
||||||
|
border-bottom: 1px solid #d0d7de;
|
||||||
|
font-size: 13px;
|
||||||
|
}
|
||||||
|
tr:hover { background: #f6f8fa; }
|
||||||
|
.status-badge {
|
||||||
|
display: inline-block;
|
||||||
|
padding: 2px 8px;
|
||||||
|
border-radius: 12px;
|
||||||
|
font-size: 11px;
|
||||||
|
font-weight: 600;
|
||||||
|
text-transform: uppercase;
|
||||||
|
}
|
||||||
|
.status-ok {
|
||||||
|
background: #dafbe1;
|
||||||
|
color: #1a7f37;
|
||||||
|
}
|
||||||
|
.status-unknown {
|
||||||
|
background: #eaeef2;
|
||||||
|
color: #57606a;
|
||||||
|
}
|
||||||
|
.status-error {
|
||||||
|
background: #ffebe9;
|
||||||
|
color: #cf222e;
|
||||||
|
}
|
||||||
|
.checkbox-cell {
|
||||||
|
text-align: center;
|
||||||
|
width: 80px;
|
||||||
|
}
|
||||||
|
.checkbox-cell input[type="checkbox"] {
|
||||||
|
cursor: pointer;
|
||||||
|
width: 16px;
|
||||||
|
height: 16px;
|
||||||
|
}
|
||||||
|
.actions-cell {
|
||||||
|
white-space: nowrap;
|
||||||
|
width: 200px;
|
||||||
|
}
|
||||||
|
.empty-state {
|
||||||
|
text-align: center;
|
||||||
|
padding: 48px;
|
||||||
|
color: #57606a;
|
||||||
|
}
|
||||||
|
.empty-state-icon {
|
||||||
|
font-size: 48px;
|
||||||
|
margin-bottom: 16px;
|
||||||
|
}
|
||||||
|
.modal {
|
||||||
|
display: none;
|
||||||
|
position: fixed;
|
||||||
|
top: 0;
|
||||||
|
left: 0;
|
||||||
|
width: 100%;
|
||||||
|
height: 100%;
|
||||||
|
background: rgba(0,0,0,0.5);
|
||||||
|
z-index: 1000;
|
||||||
|
align-items: center;
|
||||||
|
justify-content: center;
|
||||||
|
}
|
||||||
|
.modal.active { display: flex; }
|
||||||
|
.modal-content {
|
||||||
|
background: white;
|
||||||
|
padding: 24px;
|
||||||
|
border-radius: 6px;
|
||||||
|
max-width: 600px;
|
||||||
|
width: 90%;
|
||||||
|
max-height: 80vh;
|
||||||
|
overflow-y: auto;
|
||||||
|
}
|
||||||
|
.modal-header {
|
||||||
|
display: flex;
|
||||||
|
justify-content: space-between;
|
||||||
|
align-items: center;
|
||||||
|
margin-bottom: 16px;
|
||||||
|
}
|
||||||
|
.modal-header h2 {
|
||||||
|
margin: 0;
|
||||||
|
font-size: 20px;
|
||||||
|
}
|
||||||
|
.close-btn {
|
||||||
|
background: none;
|
||||||
|
border: none;
|
||||||
|
font-size: 24px;
|
||||||
|
cursor: pointer;
|
||||||
|
color: #57606a;
|
||||||
|
padding: 0;
|
||||||
|
width: 32px;
|
||||||
|
height: 32px;
|
||||||
|
}
|
||||||
|
.close-btn:hover { color: #24292f; }
|
||||||
|
.form-group {
|
||||||
|
margin-bottom: 16px;
|
||||||
|
}
|
||||||
|
.form-group label {
|
||||||
|
display: block;
|
||||||
|
margin-bottom: 6px;
|
||||||
|
font-weight: 600;
|
||||||
|
font-size: 14px;
|
||||||
|
}
|
||||||
|
.form-group input[type="text"],
|
||||||
|
.form-group input[type="number"],
|
||||||
|
.form-group input[type="password"] {
|
||||||
|
width: 100%;
|
||||||
|
padding: 8px 12px;
|
||||||
|
border: 1px solid #d0d7de;
|
||||||
|
border-radius: 6px;
|
||||||
|
font-size: 14px;
|
||||||
|
}
|
||||||
|
.form-group input[type="checkbox"] {
|
||||||
|
width: auto;
|
||||||
|
margin-right: 8px;
|
||||||
|
}
|
||||||
|
.checkbox-label {
|
||||||
|
display: flex;
|
||||||
|
align-items: center;
|
||||||
|
font-weight: normal;
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
.form-actions {
|
||||||
|
display: flex;
|
||||||
|
justify-content: flex-end;
|
||||||
|
gap: 8px;
|
||||||
|
margin-top: 24px;
|
||||||
|
}
|
||||||
|
.toast {
|
||||||
|
position: fixed;
|
||||||
|
top: 24px;
|
||||||
|
right: 24px;
|
||||||
|
padding: 12px 16px;
|
||||||
|
background: #24292f;
|
||||||
|
color: white;
|
||||||
|
border-radius: 6px;
|
||||||
|
box-shadow: 0 4px 12px rgba(0,0,0,0.15);
|
||||||
|
z-index: 2000;
|
||||||
|
display: none;
|
||||||
|
min-width: 300px;
|
||||||
|
}
|
||||||
|
.toast.active {
|
||||||
|
display: block;
|
||||||
|
animation: slideIn 0.3s ease-out;
|
||||||
|
}
|
||||||
|
@keyframes slideIn {
|
||||||
|
from {
|
||||||
|
transform: translateX(400px);
|
||||||
|
opacity: 0;
|
||||||
|
}
|
||||||
|
to {
|
||||||
|
transform: translateX(0);
|
||||||
|
opacity: 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
.toast-success { background: #2da44e; }
|
||||||
|
.toast-error { background: #cf222e; }
|
||||||
|
|
||||||
|
/* Tabs */
|
||||||
|
.tabs {
|
||||||
|
display: flex;
|
||||||
|
gap: 0;
|
||||||
|
margin-bottom: 0;
|
||||||
|
border-bottom: 2px solid #d0d7de;
|
||||||
|
}
|
||||||
|
.tab-btn {
|
||||||
|
padding: 10px 20px;
|
||||||
|
border: none;
|
||||||
|
background: none;
|
||||||
|
cursor: pointer;
|
||||||
|
font-size: 14px;
|
||||||
|
font-weight: 600;
|
||||||
|
color: #57606a;
|
||||||
|
border-bottom: 2px solid transparent;
|
||||||
|
margin-bottom: -2px;
|
||||||
|
transition: color 0.2s, border-color 0.2s;
|
||||||
|
}
|
||||||
|
.tab-btn:hover { color: #24292f; }
|
||||||
|
.tab-btn.active {
|
||||||
|
color: #24292f;
|
||||||
|
border-bottom-color: #fd8c73;
|
||||||
|
}
|
||||||
|
.tab-panel { display: none; }
|
||||||
|
.tab-panel.active { display: block; }
|
||||||
|
|
||||||
|
/* Connection pool panel */
|
||||||
|
.pool-config {
|
||||||
|
display: grid;
|
||||||
|
grid-template-columns: repeat(auto-fill, minmax(180px, 1fr));
|
||||||
|
gap: 12px;
|
||||||
|
margin-bottom: 20px;
|
||||||
|
}
|
||||||
|
.pool-config-card {
|
||||||
|
background: #f6f8fa;
|
||||||
|
border: 1px solid #d0d7de;
|
||||||
|
border-radius: 6px;
|
||||||
|
padding: 12px;
|
||||||
|
}
|
||||||
|
.pool-config-card .label {
|
||||||
|
font-size: 11px;
|
||||||
|
color: #57606a;
|
||||||
|
text-transform: uppercase;
|
||||||
|
font-weight: 600;
|
||||||
|
margin-bottom: 4px;
|
||||||
|
}
|
||||||
|
.pool-config-card .value {
|
||||||
|
font-size: 18px;
|
||||||
|
font-weight: 600;
|
||||||
|
color: #24292f;
|
||||||
|
}
|
||||||
|
.conn-card {
|
||||||
|
background: white;
|
||||||
|
border: 1px solid #d0d7de;
|
||||||
|
border-radius: 6px;
|
||||||
|
padding: 16px;
|
||||||
|
margin-bottom: 12px;
|
||||||
|
}
|
||||||
|
.conn-card-header {
|
||||||
|
display: flex;
|
||||||
|
justify-content: space-between;
|
||||||
|
align-items: center;
|
||||||
|
margin-bottom: 12px;
|
||||||
|
}
|
||||||
|
.conn-card-header strong { font-size: 15px; }
|
||||||
|
.conn-card-grid {
|
||||||
|
display: grid;
|
||||||
|
grid-template-columns: repeat(auto-fill, minmax(140px, 1fr));
|
||||||
|
gap: 8px;
|
||||||
|
}
|
||||||
|
.conn-stat .label {
|
||||||
|
font-size: 11px;
|
||||||
|
color: #57606a;
|
||||||
|
text-transform: uppercase;
|
||||||
|
font-weight: 600;
|
||||||
|
}
|
||||||
|
.conn-stat .value {
|
||||||
|
font-size: 14px;
|
||||||
|
font-weight: 600;
|
||||||
|
color: #24292f;
|
||||||
|
}
|
||||||
|
.conn-empty {
|
||||||
|
text-align: center;
|
||||||
|
padding: 32px;
|
||||||
|
color: #57606a;
|
||||||
|
}
|
||||||
|
.pool-actions {
|
||||||
|
display: flex;
|
||||||
|
gap: 8px;
|
||||||
|
margin-bottom: 16px;
|
||||||
|
}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<div class="container">
|
||||||
|
<div class="header">
|
||||||
|
<h1>SLMM - Roster &amp; Connections</h1>
|
||||||
|
<div class="nav">
|
||||||
|
<a href="/" class="btn">← Back to Control Panel</a>
|
||||||
|
<button class="btn btn-primary" onclick="openAddModal()">+ Add Device</button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="tabs">
|
||||||
|
<button class="tab-btn active" onclick="switchTab('roster')">Device Roster</button>
|
||||||
|
<button class="tab-btn" onclick="switchTab('connections')">Connections</button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Roster Tab -->
|
||||||
|
<div id="tab-roster" class="tab-panel active">
|
||||||
|
<div class="table-container" style="border-top-left-radius: 0; border-top-right-radius: 0;">
|
||||||
|
<table id="rosterTable">
|
||||||
|
<thead>
|
||||||
|
<tr>
|
||||||
|
<th>Unit ID</th>
|
||||||
|
<th>Host / IP</th>
|
||||||
|
<th>TCP Port</th>
|
||||||
|
<th>FTP Port</th>
|
||||||
|
<th class="checkbox-cell">TCP</th>
|
||||||
|
<th class="checkbox-cell">FTP</th>
|
||||||
|
<th class="checkbox-cell">Polling</th>
|
||||||
|
<th>Status</th>
|
||||||
|
<th class="actions-cell">Actions</th>
|
||||||
|
</tr>
|
||||||
|
</thead>
|
||||||
|
<tbody id="rosterBody">
|
||||||
|
<tr>
|
||||||
|
<td colspan="9" style="text-align: center; padding: 24px;">
|
||||||
|
Loading...
|
||||||
|
</td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Connections Tab -->
|
||||||
|
<div id="tab-connections" class="tab-panel">
|
||||||
|
<div class="table-container" style="padding: 20px; border-top-left-radius: 0; border-top-right-radius: 0;">
|
||||||
|
<div class="pool-actions">
|
||||||
|
<button class="btn" onclick="loadConnections()">Refresh</button>
|
||||||
|
<button class="btn btn-danger" onclick="flushConnections()">Flush All Connections</button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<h3 style="margin: 0 0 12px 0; font-size: 16px;">Pool Configuration</h3>
|
||||||
|
<div id="poolConfig" class="pool-config">
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">Status</div>
|
||||||
|
<div class="value" id="poolEnabled">--</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<h3 style="margin: 20px 0 12px 0; font-size: 16px;">Active Connections</h3>
|
||||||
|
<div id="connectionsList">
|
||||||
|
<div class="conn-empty">Loading...</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Add/Edit Modal -->
|
||||||
|
<div id="deviceModal" class="modal">
|
||||||
|
<div class="modal-content">
|
||||||
|
<div class="modal-header">
|
||||||
|
<h2 id="modalTitle">Add Device</h2>
|
||||||
|
<button class="close-btn" onclick="closeModal()">×</button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<form id="deviceForm" onsubmit="saveDevice(event)">
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="unitId">Unit ID *</label>
|
||||||
|
<input type="text" id="unitId" required placeholder="e.g., nl43-1, slm-site-a" />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="host">Host / IP Address *</label>
|
||||||
|
<input type="text" id="host" required placeholder="e.g., 192.168.1.100" />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="tcpPort">TCP Port *</label>
|
||||||
|
<input type="number" id="tcpPort" required value="2255" min="1" max="65535" />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="ftpPort">FTP Port</label>
|
||||||
|
<input type="number" id="ftpPort" value="21" min="1" max="65535" />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label class="checkbox-label">
|
||||||
|
<input type="checkbox" id="tcpEnabled" checked />
|
||||||
|
TCP Enabled (required for remote control)
|
||||||
|
</label>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label class="checkbox-label">
|
||||||
|
<input type="checkbox" id="ftpEnabled" onchange="toggleFtpCredentials()" />
|
||||||
|
FTP Enabled (for file downloads)
|
||||||
|
</label>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div id="ftpCredentialsSection" style="display: none; padding: 12px; background: #f6f8fa; border-radius: 6px; margin-bottom: 16px;">
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="ftpUsername">FTP Username</label>
|
||||||
|
<input type="text" id="ftpUsername" placeholder="Default: USER" />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="ftpPassword">FTP Password</label>
|
||||||
|
<input type="password" id="ftpPassword" placeholder="Default: 0000" />
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label class="checkbox-label">
|
||||||
|
<input type="checkbox" id="pollEnabled" checked />
|
||||||
|
Enable background polling (status updates)
|
||||||
|
</label>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-group">
|
||||||
|
<label for="pollInterval">Polling Interval (seconds)</label>
|
||||||
|
<input type="number" id="pollInterval" value="60" min="10" max="3600" />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="form-actions">
|
||||||
|
<button type="button" class="btn" onclick="closeModal()">Cancel</button>
|
||||||
|
<button type="submit" class="btn btn-primary">Save Device</button>
|
||||||
|
</div>
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Toast Notification -->
|
||||||
|
<div id="toast" class="toast"></div>
|
||||||
|
|
||||||
|
<script>
|
||||||
|
let devices = [];
|
||||||
|
let editingDeviceId = null;
|
||||||
|
|
||||||
|
// Load roster on page load
|
||||||
|
document.addEventListener('DOMContentLoaded', () => {
|
||||||
|
loadRoster();
|
||||||
|
});
|
||||||
|
|
||||||
|
async function loadRoster() {
|
||||||
|
try {
|
||||||
|
const res = await fetch('/api/nl43/roster');
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
showToast('Failed to load roster', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
devices = data.devices || [];
|
||||||
|
renderRoster();
|
||||||
|
} catch (err) {
|
||||||
|
showToast('Error loading roster: ' + err.message, 'error');
|
||||||
|
console.error('Load roster error:', err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderRoster() {
|
||||||
|
const tbody = document.getElementById('rosterBody');
|
||||||
|
|
||||||
|
if (devices.length === 0) {
|
||||||
|
tbody.innerHTML = `
|
||||||
|
<tr>
|
||||||
|
<td colspan="9" class="empty-state">
|
||||||
|
<div class="empty-state-icon">📭</div>
|
||||||
|
<div><strong>No devices configured</strong></div>
|
||||||
|
<div style="margin-top: 8px; font-size: 14px;">Click "Add Device" to configure your first sound level meter</div>
|
||||||
|
</td>
|
||||||
|
</tr>
|
||||||
|
`;
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
tbody.innerHTML = devices.map(device => `
|
||||||
|
<tr>
|
||||||
|
<td><strong>${escapeHtml(device.unit_id)}</strong></td>
|
||||||
|
<td>${escapeHtml(device.host)}</td>
|
||||||
|
<td>${device.tcp_port}</td>
|
||||||
|
<td>${device.ftp_port || 21}</td>
|
||||||
|
<td class="checkbox-cell">
|
||||||
|
<input type="checkbox" ${device.tcp_enabled ? 'checked' : ''} disabled />
|
||||||
|
</td>
|
||||||
|
<td class="checkbox-cell">
|
||||||
|
<input type="checkbox" ${device.ftp_enabled ? 'checked' : ''} disabled />
|
||||||
|
</td>
|
||||||
|
<td class="checkbox-cell">
|
||||||
|
<input type="checkbox" ${device.poll_enabled ? 'checked' : ''} disabled />
|
||||||
|
</td>
|
||||||
|
<td>
|
||||||
|
${getStatusBadge(device)}
|
||||||
|
</td>
|
||||||
|
<td class="actions-cell">
|
||||||
|
<button class="btn btn-small" onclick="testDevice('${escapeHtml(device.unit_id)}')">Test</button>
|
||||||
|
<button class="btn btn-small" onclick="openEditModal('${escapeHtml(device.unit_id)}')">Edit</button>
|
||||||
|
<button class="btn btn-small btn-danger" onclick="deleteDevice('${escapeHtml(device.unit_id)}')">Delete</button>
|
||||||
|
</td>
|
||||||
|
</tr>
|
||||||
|
`).join('');
|
||||||
|
}
|
||||||
|
|
||||||
|
function getStatusBadge(device) {
|
||||||
|
if (!device.status) {
|
||||||
|
return '<span class="status-badge status-unknown">Unknown</span>';
|
||||||
|
}
|
||||||
|
|
||||||
|
if (device.status.is_reachable === false) {
|
||||||
|
return '<span class="status-badge status-error">Offline</span>';
|
||||||
|
}
|
||||||
|
|
||||||
|
if (device.status.last_success) {
|
||||||
|
const lastSeen = new Date(device.status.last_success);
|
||||||
|
const ago = Math.floor((Date.now() - lastSeen) / 1000);
|
||||||
|
if (ago < 300) { // Less than 5 minutes
|
||||||
|
return '<span class="status-badge status-ok">Online</span>';
|
||||||
|
} else {
|
||||||
|
return `<span class="status-badge status-unknown">Stale (${Math.floor(ago / 60)}m ago)</span>`;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return '<span class="status-badge status-unknown">Unknown</span>';
|
||||||
|
}
|
||||||
|
|
||||||
|
function escapeHtml(text) {
    // Map each HTML-significant character to its entity.
    const map = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#039;'
    };
    return String(text).replace(/[&<>"']/g, m => map[m]);
}
|
||||||
|
|
||||||
|
function openAddModal() {
|
||||||
|
editingDeviceId = null;
|
||||||
|
document.getElementById('modalTitle').textContent = 'Add Device';
|
||||||
|
document.getElementById('deviceForm').reset();
|
||||||
|
document.getElementById('unitId').disabled = false;
|
||||||
|
document.getElementById('tcpEnabled').checked = true;
|
||||||
|
document.getElementById('ftpEnabled').checked = false;
|
||||||
|
document.getElementById('pollEnabled').checked = true;
|
||||||
|
document.getElementById('tcpPort').value = 2255;
|
||||||
|
document.getElementById('ftpPort').value = 21;
|
||||||
|
document.getElementById('pollInterval').value = 60;
|
||||||
|
toggleFtpCredentials();
|
||||||
|
document.getElementById('deviceModal').classList.add('active');
|
||||||
|
}
|
||||||
|
|
||||||
|
function openEditModal(unitId) {
|
||||||
|
const device = devices.find(d => d.unit_id === unitId);
|
||||||
|
if (!device) {
|
||||||
|
showToast('Device not found', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
editingDeviceId = unitId;
|
||||||
|
document.getElementById('modalTitle').textContent = 'Edit Device';
|
||||||
|
document.getElementById('unitId').value = device.unit_id;
|
||||||
|
document.getElementById('unitId').disabled = true;
|
||||||
|
document.getElementById('host').value = device.host;
|
||||||
|
document.getElementById('tcpPort').value = device.tcp_port;
|
||||||
|
document.getElementById('ftpPort').value = device.ftp_port || 21;
|
||||||
|
document.getElementById('tcpEnabled').checked = device.tcp_enabled;
|
||||||
|
document.getElementById('ftpEnabled').checked = device.ftp_enabled;
|
||||||
|
document.getElementById('ftpUsername').value = device.ftp_username || '';
|
||||||
|
document.getElementById('ftpPassword').value = device.ftp_password || '';
|
||||||
|
document.getElementById('pollEnabled').checked = device.poll_enabled;
|
||||||
|
document.getElementById('pollInterval').value = device.poll_interval_seconds || 60;
|
||||||
|
toggleFtpCredentials();
|
||||||
|
document.getElementById('deviceModal').classList.add('active');
|
||||||
|
}
|
||||||
|
|
||||||
|
function closeModal() {
|
||||||
|
document.getElementById('deviceModal').classList.remove('active');
|
||||||
|
editingDeviceId = null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function toggleFtpCredentials() {
|
||||||
|
const ftpEnabled = document.getElementById('ftpEnabled').checked;
|
||||||
|
document.getElementById('ftpCredentialsSection').style.display = ftpEnabled ? 'block' : 'none';
|
||||||
|
}
|
||||||
|
|
||||||
|
async function saveDevice(event) {
|
||||||
|
event.preventDefault();
|
||||||
|
|
||||||
|
const unitId = document.getElementById('unitId').value.trim();
|
||||||
|
const payload = {
|
||||||
|
host: document.getElementById('host').value.trim(),
|
||||||
|
tcp_port: parseInt(document.getElementById('tcpPort').value, 10),
|
||||||
|
ftp_port: parseInt(document.getElementById('ftpPort').value, 10),
|
||||||
|
tcp_enabled: document.getElementById('tcpEnabled').checked,
|
||||||
|
ftp_enabled: document.getElementById('ftpEnabled').checked,
|
||||||
|
poll_enabled: document.getElementById('pollEnabled').checked,
|
||||||
|
poll_interval_seconds: parseInt(document.getElementById('pollInterval').value, 10)
|
||||||
|
};
|
||||||
|
|
||||||
|
if (payload.ftp_enabled) {
|
||||||
|
const username = document.getElementById('ftpUsername').value.trim();
|
||||||
|
const password = document.getElementById('ftpPassword').value.trim();
|
||||||
|
if (username) payload.ftp_username = username;
|
||||||
|
if (password) payload.ftp_password = password;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
const url = editingDeviceId
|
||||||
|
? `/api/nl43/${encodeURIComponent(editingDeviceId)}/config`
|
||||||
|
: `/api/nl43/roster`;
|
||||||
|
|
||||||
|
const method = editingDeviceId ? 'PUT' : 'POST';
|
||||||
|
|
||||||
|
const body = editingDeviceId
|
||||||
|
? payload
|
||||||
|
: { unit_id: unitId, ...payload };
|
||||||
|
|
||||||
|
const res = await fetch(url, {
|
||||||
|
method,
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify(body)
|
||||||
|
});
|
||||||
|
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
showToast(data.detail || 'Failed to save device', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
showToast(editingDeviceId ? 'Device updated successfully' : 'Device added successfully', 'success');
|
||||||
|
closeModal();
|
||||||
|
await loadRoster();
|
||||||
|
} catch (err) {
|
||||||
|
showToast('Error saving device: ' + err.message, 'error');
|
||||||
|
console.error('Save device error:', err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function deleteDevice(unitId) {
|
||||||
|
if (!confirm(`Are you sure you want to delete "${unitId}"?\n\nThis will remove the device configuration but will not affect the physical device.`)) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
const res = await fetch(`/api/nl43/${encodeURIComponent(unitId)}/config`, {
|
||||||
|
method: 'DELETE'
|
||||||
|
});
|
||||||
|
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
showToast(data.detail || 'Failed to delete device', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
showToast('Device deleted successfully', 'success');
|
||||||
|
await loadRoster();
|
||||||
|
} catch (err) {
|
||||||
|
showToast('Error deleting device: ' + err.message, 'error');
|
||||||
|
console.error('Delete device error:', err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function testDevice(unitId) {
|
||||||
|
showToast('Testing device connection...', 'success');
|
||||||
|
|
||||||
|
try {
|
||||||
|
const res = await fetch(`/api/nl43/${encodeURIComponent(unitId)}/diagnostics`);
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
showToast('Device test failed', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
const statusText = {
|
||||||
|
'pass': 'All systems operational ✓',
|
||||||
|
'fail': 'Connection failed ✗',
|
||||||
|
'degraded': 'Partial connectivity ⚠'
|
||||||
|
};
|
||||||
|
|
||||||
|
showToast(statusText[data.overall_status] || 'Test complete',
|
||||||
|
data.overall_status === 'pass' ? 'success' : 'error');
|
||||||
|
|
||||||
|
// Reload to update status
|
||||||
|
await loadRoster();
|
||||||
|
} catch (err) {
|
||||||
|
showToast('Error testing device: ' + err.message, 'error');
|
||||||
|
console.error('Test device error:', err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function showToast(message, type = 'success') {
|
||||||
|
const toast = document.getElementById('toast');
|
||||||
|
toast.textContent = message;
|
||||||
|
toast.className = `toast toast-${type} active`;
|
||||||
|
|
||||||
|
setTimeout(() => {
|
||||||
|
toast.classList.remove('active');
|
||||||
|
}, 3000);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Close modal when clicking outside
|
||||||
|
document.getElementById('deviceModal').addEventListener('click', (e) => {
|
||||||
|
if (e.target.id === 'deviceModal') {
|
||||||
|
closeModal();
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
// ========== Tab Switching ==========
|
||||||
|
|
||||||
|
function switchTab(tabName) {
|
||||||
|
document.querySelectorAll('.tab-btn').forEach(btn => btn.classList.remove('active'));
|
||||||
|
document.querySelectorAll('.tab-panel').forEach(panel => panel.classList.remove('active'));
|
||||||
|
|
||||||
|
document.querySelector(`.tab-btn[onclick="switchTab('${tabName}')"]`).classList.add('active');
|
||||||
|
document.getElementById(`tab-${tabName}`).classList.add('active');
|
||||||
|
|
||||||
|
if (tabName === 'connections') {
|
||||||
|
loadConnections();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ========== Connection Pool ==========
|
||||||
|
|
||||||
|
let connectionsRefreshTimer = null;
|
||||||
|
|
||||||
|
async function loadConnections() {
|
||||||
|
try {
|
||||||
|
const res = await fetch('/api/nl43/_connections/status');
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
showToast('Failed to load connection pool status', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
const pool = data.pool;
|
||||||
|
renderPoolConfig(pool);
|
||||||
|
renderConnections(pool.connections);
|
||||||
|
|
||||||
|
// Auto-refresh while tab is active
|
||||||
|
clearTimeout(connectionsRefreshTimer);
|
||||||
|
if (document.getElementById('tab-connections').classList.contains('active')) {
|
||||||
|
connectionsRefreshTimer = setTimeout(loadConnections, 5000);
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
showToast('Error loading connections: ' + err.message, 'error');
|
||||||
|
console.error('Load connections error:', err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderPoolConfig(pool) {
|
||||||
|
document.getElementById('poolConfig').innerHTML = `
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">Persistent</div>
|
||||||
|
<div class="value" style="color: ${pool.enabled ? '#1a7f37' : '#cf222e'}">${pool.enabled ? 'Enabled' : 'Disabled'}</div>
|
||||||
|
</div>
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">Active</div>
|
||||||
|
<div class="value">${pool.active_connections}</div>
|
||||||
|
</div>
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">Idle TTL</div>
|
||||||
|
<div class="value">${pool.idle_ttl}s</div>
|
||||||
|
</div>
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">Max Age</div>
|
||||||
|
<div class="value">${pool.max_age}s</div>
|
||||||
|
</div>
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">KA Idle</div>
|
||||||
|
<div class="value">${pool.keepalive_idle}s</div>
|
||||||
|
</div>
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">KA Interval</div>
|
||||||
|
<div class="value">${pool.keepalive_interval}s</div>
|
||||||
|
</div>
|
||||||
|
<div class="pool-config-card">
|
||||||
|
<div class="label">KA Probes</div>
|
||||||
|
<div class="value">${pool.keepalive_count}</div>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function renderConnections(connections) {
|
||||||
|
const container = document.getElementById('connectionsList');
|
||||||
|
const keys = Object.keys(connections);
|
||||||
|
|
||||||
|
if (keys.length === 0) {
|
||||||
|
container.innerHTML = `
|
||||||
|
<div class="conn-empty">
|
||||||
|
<div style="font-size: 32px; margin-bottom: 8px;">~</div>
|
||||||
|
<div><strong>No active connections</strong></div>
|
||||||
|
<div style="margin-top: 4px; font-size: 13px;">
|
||||||
|
Connections appear here when devices are actively being polled and the connection is cached between commands.
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
container.innerHTML = keys.map(key => {
|
||||||
|
const conn = connections[key];
|
||||||
|
const aliveColor = conn.alive ? '#1a7f37' : '#cf222e';
|
||||||
|
const aliveText = conn.alive ? 'Alive' : 'Stale';
|
||||||
|
return `
|
||||||
|
<div class="conn-card">
|
||||||
|
<div class="conn-card-header">
|
||||||
|
<strong>${escapeHtml(key)}</strong>
|
||||||
|
<span class="status-badge ${conn.alive ? 'status-ok' : 'status-error'}">${aliveText}</span>
|
||||||
|
</div>
|
||||||
|
<div class="conn-card-grid">
|
||||||
|
<div class="conn-stat">
|
||||||
|
<div class="label">Host</div>
|
||||||
|
<div class="value">${escapeHtml(conn.host)}</div>
|
||||||
|
</div>
|
||||||
|
<div class="conn-stat">
|
||||||
|
<div class="label">Port</div>
|
||||||
|
<div class="value">${conn.port}</div>
|
||||||
|
</div>
|
||||||
|
<div class="conn-stat">
|
||||||
|
<div class="label">Age</div>
|
||||||
|
<div class="value">${formatSeconds(conn.age_seconds)}</div>
|
||||||
|
</div>
|
||||||
|
<div class="conn-stat">
|
||||||
|
<div class="label">Idle</div>
|
||||||
|
<div class="value">${formatSeconds(conn.idle_seconds)}</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
`;
|
||||||
|
}).join('');
|
||||||
|
}
|
||||||
|
|
||||||
|
function formatSeconds(s) {
|
||||||
|
if (s < 60) return Math.round(s) + 's';
|
||||||
|
if (s < 3600) return Math.floor(s / 60) + 'm ' + Math.floor(s % 60) + 's'; // floor avoids showing "1m 60s"
|
||||||
|
return Math.floor(s / 3600) + 'h ' + Math.floor((s % 3600) / 60) + 'm';
|
||||||
|
}
|
||||||
|
|
||||||
|
async function flushConnections() {
|
||||||
|
if (!confirm('Close all cached TCP connections?\n\nDevices will reconnect on the next poll cycle.')) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
const res = await fetch('/api/nl43/_connections/flush', { method: 'POST' });
|
||||||
|
const data = await res.json();
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
showToast(data.detail || 'Failed to flush connections', 'error');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
showToast('All connections flushed', 'success');
|
||||||
|
await loadConnections();
|
||||||
|
} catch (err) {
|
||||||
|
showToast('Error flushing connections: ' + err.message, 'error');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
@@ -0,0 +1,68 @@
|
|||||||
|
"""
|
||||||
|
Synthetic unit test for the alert state machine: no DB, no device.

Drives `_evaluate_step` with a fake clock + a level series and checks that
onset/clear fire with the right debounce + hysteresis. Run:

    docker compose exec -T slmm python3 test_alert_evaluator.py
    # or, if app.alerts imports cleanly standalone: python3 test_alert_evaluator.py
"""

from types import SimpleNamespace
from app.alerts import RuleState, _evaluate_step


def rule(**kw):
    base = dict(threshold_db=85.0, duration_s=3, clear_margin_db=2.0, comparison="above")
    base.update(kw)
    return SimpleNamespace(**base)


def run(series, r):
    st = RuleState()
    events = [(now, a) for value, now in series
              if (a := _evaluate_step(st, value, now, r))]
    return events, st


def main():
    failures = 0

    def check(label, cond, detail=""):
        nonlocal failures
        print(("PASS" if cond else "FAIL"), label, detail)
        if not cond:
            failures += 1

    # 1) sustained exceedance -> onset after duration; recovery -> clear after duration
    r = rule(threshold_db=85, duration_s=3, clear_margin_db=2)
    ev, _ = run([(80, 0), (86, 1), (87, 2), (88, 3), (88, 4),
                 (88, 5), (82, 6), (82, 7), (82, 8), (82, 9)], r)
    onsets = [t for t, a in ev if a == "onset"]
    clears = [t for t, a in ev if a == "clear"]
    check("1 sustained onset@4 / clear@9", onsets == [4] and clears == [9], str(ev))

    # 2) brief spike under duration -> no onset (debounce)
    ev, _ = run([(80, 0), (90, 1), (90, 2), (80, 3), (80, 4)], rule(duration_s=3))
    check("2 brief spike debounced", ev == [], str(ev))

    # 3) hysteresis: a dip into the margin (below threshold, above threshold-margin)
    #    does NOT clear
    r = rule(threshold_db=85, duration_s=0, clear_margin_db=3)
    ev, st = run([(86, 0), (84, 1), (84, 2), (84, 3)], r)
    check("3 hysteresis holds ACTIVE", ev == [(0, "onset")] and st.phase == "active",
          f"{ev} phase={st.phase}")

    # 4) 'below' comparison (device too quiet) -> onset when value < threshold
    ev, _ = run([(30, 0), (15, 1)], rule(threshold_db=20, duration_s=0,
                                         clear_margin_db=2, comparison="below"))
    check("4 below-comparison onset@1", ev == [(1, "onset")], str(ev))

    print()
    print("ALL PASS" if failures == 0 else f"{failures} FAILURE(S)")
    return failures


if __name__ == "__main__":
    import sys
    sys.exit(1 if main() else 0)
|
||||||
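The debounce and hysteresis behaviour that test_alert_evaluator.py exercises can be illustrated with a minimal, self-contained sketch. This is not the code in `app.alerts`: `SketchState` and `sketch_step` are hypothetical names, and the strict comparisons at the threshold and margin boundaries are assumptions that the four test cases do not pin down. The rule fields (`threshold_db`, `duration_s`, `clear_margin_db`, `comparison`) and the `"idle"`-to-`"active"` idea follow the test file; the `"idle"` phase name itself is assumed.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class SketchState:
    """Hypothetical per-rule state; not the real app.alerts.RuleState."""
    phase: str = "idle"            # "idle" or "active" ("idle" is an assumed name)
    since: Optional[float] = None  # time at which the pending condition first held


def sketch_step(st, value, now, rule):
    """Process one sample; return "onset", "clear", or None."""
    above = rule.comparison == "above"
    if st.phase == "idle":
        # Onset condition: the level is past the threshold.
        cond = value > rule.threshold_db if above else value < rule.threshold_db
    elif above:
        # Clear condition (hysteresis): recovered past the threshold by the margin.
        cond = value < rule.threshold_db - rule.clear_margin_db
    else:
        cond = value > rule.threshold_db + rule.clear_margin_db

    if not cond:
        st.since = None            # condition broken: restart the debounce timer
        return None
    if st.since is None:
        st.since = now
    if now - st.since < rule.duration_s:
        return None                # condition holds but has not lasted long enough
    st.since = None
    if st.phase == "idle":
        st.phase = "active"
        return "onset"
    st.phase = "idle"
    return "clear"


# Same series as case 1 in the test file: (level, time) pairs.
demo_rule = SimpleNamespace(threshold_db=85, duration_s=3,
                            clear_margin_db=2, comparison="above")
demo_state = SketchState()
demo_series = [(80, 0), (86, 1), (87, 2), (88, 3), (88, 4),
               (88, 5), (82, 6), (82, 7), (82, 8), (82, 9)]
demo_events = [(t, a) for v, t in demo_series
               if (a := sketch_step(demo_state, v, t, demo_rule))]
print(demo_events)  # [(4, 'onset'), (9, 'clear')]
```

A single `since` timestamp serves both directions here: it times the exceedance while idle and the recovery while active, which keeps the state small at the cost of resetting on any single contrary sample.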
Executable
+167
@@ -0,0 +1,167 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
# Manual test script for background polling functionality
|
||||||
|
# Usage: ./test_polling.sh [UNIT_ID]
|
||||||
|
|
||||||
|
BASE_URL="http://localhost:8100/api/nl43"
|
||||||
|
UNIT_ID="${1:-NL43-001}"
|
||||||
|
|
||||||
|
echo "=========================================="
|
||||||
|
echo "Background Polling Test Script"
|
||||||
|
echo "=========================================="
|
||||||
|
echo "Testing device: $UNIT_ID"
|
||||||
|
echo "Base URL: $BASE_URL"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Color codes for output
|
||||||
|
GREEN='\033[0;32m'
|
||||||
|
YELLOW='\033[1;33m'
|
||||||
|
RED='\033[0;31m'
|
||||||
|
NC='\033[0m' # No Color
|
||||||
|
|
||||||
|
# Function to print test header
|
||||||
|
test_header() {
|
||||||
|
echo ""
|
||||||
|
echo "=========================================="
|
||||||
|
echo "$1"
|
||||||
|
echo "=========================================="
|
||||||
|
}
|
||||||
|
|
||||||
|
# Function to print success
|
||||||
|
success() {
|
||||||
|
echo -e "${GREEN}✓${NC} $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Function to print warning
|
||||||
|
warning() {
|
||||||
|
echo -e "${YELLOW}⚠${NC} $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Function to print error
|
||||||
|
error() {
|
||||||
|
echo -e "${RED}✗${NC} $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Test 1: Get current polling configuration
|
||||||
|
test_header "Test 1: Get Current Polling Configuration"
|
||||||
|
RESPONSE=$(curl -s "$BASE_URL/$UNIT_ID/polling/config")
|
||||||
|
echo "$RESPONSE" | jq '.'
|
||||||
|
|
||||||
|
if echo "$RESPONSE" | jq -e '.status == "ok"' > /dev/null; then
|
||||||
|
success "Successfully retrieved polling configuration"
|
||||||
|
CURRENT_INTERVAL=$(echo "$RESPONSE" | jq -r '.data.poll_interval_seconds')
|
||||||
|
CURRENT_ENABLED=$(echo "$RESPONSE" | jq -r '.data.poll_enabled')
|
||||||
|
echo " Current interval: ${CURRENT_INTERVAL}s"
|
||||||
|
echo " Polling enabled: $CURRENT_ENABLED"
|
||||||
|
else
|
||||||
|
error "Failed to retrieve polling configuration"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test 2: Update polling interval to 30 seconds
|
||||||
|
test_header "Test 2: Update Polling Interval to 30 Seconds"
|
||||||
|
RESPONSE=$(curl -s -X PUT "$BASE_URL/$UNIT_ID/polling/config" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"poll_interval_seconds": 30}')
|
||||||
|
echo "$RESPONSE" | jq '.'
|
||||||
|
|
||||||
|
if echo "$RESPONSE" | jq -e '.status == "ok"' > /dev/null; then
|
||||||
|
success "Successfully updated polling interval to 30s"
|
||||||
|
else
|
||||||
|
error "Failed to update polling interval"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test 3: Check global polling status
|
||||||
|
test_header "Test 3: Check Global Polling Status"
|
||||||
|
RESPONSE=$(curl -s "$BASE_URL/_polling/status")
|
||||||
|
echo "$RESPONSE" | jq '.'
|
||||||
|
|
||||||
|
if echo "$RESPONSE" | jq -e '.status == "ok"' > /dev/null; then
|
||||||
|
success "Successfully retrieved global polling status"
|
||||||
|
POLLER_RUNNING=$(echo "$RESPONSE" | jq -r '.data.poller_running')
|
||||||
|
TOTAL_DEVICES=$(echo "$RESPONSE" | jq -r '.data.total_devices')
|
||||||
|
echo " Poller running: $POLLER_RUNNING"
|
||||||
|
echo " Total devices: $TOTAL_DEVICES"
|
||||||
|
else
|
||||||
|
error "Failed to retrieve global polling status"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test 4: Wait for automatic poll to occur
|
||||||
|
test_header "Test 4: Wait for Automatic Poll (35 seconds)"
|
||||||
|
warning "Waiting 35 seconds for automatic poll to occur..."
|
||||||
|
for i in {35..1}; do
|
||||||
|
echo -ne " ${i}s remaining...\r"
|
||||||
|
sleep 1
|
||||||
|
done
|
||||||
|
echo ""
|
||||||
|
success "Wait complete"
|
||||||
|
|
||||||
|
# Test 5: Check if status was updated by background poller
|
||||||
|
test_header "Test 5: Verify Background Poll Occurred"
|
||||||
|
RESPONSE=$(curl -s "$BASE_URL/$UNIT_ID/status")
|
||||||
|
echo "$RESPONSE" | jq '.data | {last_poll_attempt, last_success, is_reachable, consecutive_failures}'
|
||||||
|
|
||||||
|
if echo "$RESPONSE" | jq -e '.status == "ok"' > /dev/null; then
|
||||||
|
LAST_POLL=$(echo "$RESPONSE" | jq -r '.data.last_poll_attempt')
|
||||||
|
IS_REACHABLE=$(echo "$RESPONSE" | jq -r '.data.is_reachable')
|
||||||
|
FAILURES=$(echo "$RESPONSE" | jq -r '.data.consecutive_failures')
|
||||||
|
|
||||||
|
if [ "$LAST_POLL" != "null" ]; then
|
||||||
|
success "Device was polled by background poller"
|
||||||
|
echo " Last poll: $LAST_POLL"
|
||||||
|
echo " Reachable: $IS_REACHABLE"
|
||||||
|
echo " Failures: $FAILURES"
|
||||||
|
else
|
||||||
|
warning "No automatic poll detected yet"
|
||||||
|
fi
|
||||||
|
else
|
||||||
|
error "Failed to retrieve device status"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test 6: Disable polling
|
||||||
|
test_header "Test 6: Disable Background Polling"
|
||||||
|
RESPONSE=$(curl -s -X PUT "$BASE_URL/$UNIT_ID/polling/config" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"poll_enabled": false}')
|
||||||
|
echo "$RESPONSE" | jq '.'
|
||||||
|
|
||||||
|
if echo "$RESPONSE" | jq -e '.status == "ok"' > /dev/null; then
|
||||||
|
success "Successfully disabled background polling"
|
||||||
|
else
|
||||||
|
error "Failed to disable polling"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test 7: Verify polling is disabled
|
||||||
|
test_header "Test 7: Verify Polling Disabled in Global Status"
|
||||||
|
RESPONSE=$(curl -s "$BASE_URL/_polling/status")
|
||||||
|
DEVICE_ENABLED=$(echo "$RESPONSE" | jq --arg uid "$UNIT_ID" '.data.devices[] | select(.unit_id == $uid) | .poll_enabled')
|
||||||
|
|
||||||
|
if [ "$DEVICE_ENABLED" == "false" ]; then
|
||||||
|
success "Polling correctly shows as disabled for $UNIT_ID"
|
||||||
|
else
|
||||||
|
warning "Device still appears in polling list or shows as enabled"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test 8: Re-enable polling with original interval
|
||||||
|
test_header "Test 8: Re-enable Polling with Original Interval"
|
||||||
|
RESPONSE=$(curl -s -X PUT "$BASE_URL/$UNIT_ID/polling/config" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d "{\"poll_enabled\": true, \"poll_interval_seconds\": $CURRENT_INTERVAL}")
|
||||||
|
echo "$RESPONSE" | jq '.'
|
||||||
|
|
||||||
|
if echo "$RESPONSE" | jq -e '.status == "ok"' > /dev/null; then
|
||||||
|
success "Successfully re-enabled polling with ${CURRENT_INTERVAL}s interval"
|
||||||
|
else
|
||||||
|
error "Failed to re-enable polling"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
test_header "Test Summary"
|
||||||
|
echo "All tests completed!"
|
||||||
|
echo ""
|
||||||
|
echo "Key endpoints tested:"
|
||||||
|
echo " GET $BASE_URL/{unit_id}/polling/config"
|
||||||
|
echo " PUT $BASE_URL/{unit_id}/polling/config"
|
||||||
|
echo " GET $BASE_URL/_polling/status"
|
||||||
|
echo " GET $BASE_URL/{unit_id}/status (with polling fields)"
|
||||||
|
echo ""
|
||||||
|
echo "Review the ✓/✗/⚠ markers above; this script does not track failures, so it cannot confirm overall success."
|
||||||
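The check in Test 5 of test_polling.sh can also be expressed as a small pure function, which is easier to unit-test than a shell pipeline. The sketch below assumes the response envelope implied by the script's jq filters (`status` at the top level, polling fields under `data`); `summarize_status` is a hypothetical helper, and the real API may return additional fields.

```python
import json


def summarize_status(body: str) -> dict:
    """Parse a status response body and pull out the polling fields.

    Assumes the envelope {"status": "ok", "data": {...}} suggested by the
    jq filters in test_polling.sh; this is not taken from the API code.
    """
    payload = json.loads(body)
    if payload.get("status") != "ok":
        raise ValueError("status endpoint did not return ok")
    data = payload.get("data") or {}
    return {
        # A null last_poll_attempt means the background poller has not run yet.
        "polled": data.get("last_poll_attempt") is not None,
        "is_reachable": data.get("is_reachable"),
        "consecutive_failures": data.get("consecutive_failures"),
    }


# Hand-written sample body for illustration only, not captured from a device.
sample = ('{"status": "ok", "data": {"last_poll_attempt": "2026-01-01T00:00:00", '
          '"is_reachable": true, "consecutive_failures": 0}}')
print(summarize_status(sample))
```

Keeping the parsing separate from the HTTP call means the same function can be fed either a live response or a canned string.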
@@ -1,128 +0,0 @@
|
|||||||
#!/usr/bin/env python3
|
|
||||||
"""
|
|
||||||
Test script to verify that sleep mode is automatically disabled when:
|
|
||||||
1. Device configuration is created/updated with TCP enabled
|
|
||||||
2. Measurements are started
|
|
||||||
|
|
||||||
This script tests the API endpoints, not the actual device communication.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import requests
|
|
||||||
import json
|
|
||||||
|
|
||||||
BASE_URL = "http://localhost:8100/api/nl43"
|
|
||||||
UNIT_ID = "test-nl43-001"
|
|
||||||
|
|
||||||
def test_config_update():
|
|
||||||
"""Test that config update works (actual sleep mode disable requires real device)"""
|
|
||||||
print("\n=== Testing Config Update ===")
|
|
||||||
|
|
||||||
# Create/update a device config
|
|
||||||
config_data = {
|
|
||||||
"host": "192.168.1.100",
|
|
||||||
"tcp_port": 2255,
|
|
||||||
"tcp_enabled": True,
|
|
||||||
"ftp_enabled": False,
|
|
||||||
"ftp_username": "admin",
|
|
||||||
"ftp_password": "password"
|
|
||||||
}
|
|
||||||
|
|
||||||
print(f"Updating config for {UNIT_ID}...")
|
|
||||||
response = requests.put(f"{BASE_URL}/{UNIT_ID}/config", json=config_data)
|
|
||||||
|
|
||||||
if response.status_code == 200:
|
|
||||||
print("✓ Config updated successfully")
|
|
||||||
print(f"Response: {json.dumps(response.json(), indent=2)}")
|
|
||||||
print("\nNote: Sleep mode disable was attempted (will succeed if device is reachable)")
|
|
||||||
return True
|
|
||||||
else:
|
|
||||||
print(f"✗ Config update failed: {response.status_code}")
|
|
||||||
print(f"Error: {response.text}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
def test_get_config():
|
|
||||||
"""Test retrieving the config"""
|
|
||||||
print("\n=== Testing Get Config ===")
|
|
||||||
|
|
||||||
response = requests.get(f"{BASE_URL}/{UNIT_ID}/config")
|
|
||||||
|
|
||||||
if response.status_code == 200:
|
|
||||||
print("✓ Config retrieved successfully")
|
|
||||||
print(f"Response: {json.dumps(response.json(), indent=2)}")
|
|
||||||
return True
|
|
||||||
elif response.status_code == 404:
|
|
||||||
print("✗ Config not found (create one first)")
|
|
||||||
return False
|
|
||||||
else:
|
|
||||||
print(f"✗ Request failed: {response.status_code}")
|
|
||||||
print(f"Error: {response.text}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
def test_start_measurement():
|
|
||||||
"""Test that start measurement attempts to disable sleep mode"""
|
|
||||||
print("\n=== Testing Start Measurement ===")
|
|
||||||
|
|
||||||
print(f"Attempting to start measurement on {UNIT_ID}...")
|
|
||||||
response = requests.post(f"{BASE_URL}/{UNIT_ID}/start")
|
|
||||||
|
|
||||||
if response.status_code == 200:
|
|
||||||
print("✓ Start command accepted")
|
|
||||||
print(f"Response: {json.dumps(response.json(), indent=2)}")
|
|
||||||
print("\nNote: Sleep mode was disabled before starting measurement")
|
|
||||||
return True
|
|
||||||
elif response.status_code == 404:
|
|
||||||
print("✗ Device config not found (create config first)")
|
|
||||||
return False
|
|
||||||
elif response.status_code == 502:
|
|
||||||
print("✗ Device not reachable (expected if no physical device)")
|
|
||||||
print(f"Response: {response.text}")
|
|
||||||
print("\nNote: This is expected behavior when testing without a physical device")
|
|
||||||
return True # This is actually success - the endpoint tried to communicate
|
|
||||||
else:
|
|
||||||
print(f"✗ Request failed: {response.status_code}")
|
|
||||||
print(f"Error: {response.text}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
def main():
|
|
||||||
print("=" * 60)
|
|
||||||
print("Sleep Mode Auto-Disable Test")
|
|
||||||
print("=" * 60)
|
|
||||||
print("\nThis test verifies that sleep mode is automatically disabled")
|
|
||||||
print("when device configs are updated or measurements are started.")
|
|
||||||
print("\nNote: Without a physical device, some operations will fail at")
|
|
||||||
print("the device communication level, but the API logic will execute.")
|
|
||||||
|
|
||||||
# Run tests
|
|
||||||
results = []
|
|
||||||
|
|
||||||
# Test 1: Update config (should attempt to disable sleep mode)
|
|
||||||
results.append(("Config Update", test_config_update()))
|
|
||||||
|
|
||||||
# Test 2: Get config
|
|
||||||
results.append(("Get Config", test_get_config()))
|
|
||||||
|
|
||||||
# Test 3: Start measurement (should attempt to disable sleep mode)
|
|
||||||
results.append(("Start Measurement", test_start_measurement()))
|
|
||||||
|
|
||||||
# Summary
|
|
||||||
print("\n" + "=" * 60)
|
|
||||||
print("Test Summary")
|
|
||||||
print("=" * 60)
|
|
||||||
|
|
||||||
for test_name, result in results:
|
|
||||||
status = "✓ PASS" if result else "✗ FAIL"
|
|
||||||
print(f"{status}: {test_name}")
|
|
||||||
|
|
||||||
print("\n" + "=" * 60)
|
|
||||||
print("Implementation Details:")
|
|
||||||
print("=" * 60)
|
|
||||||
print("1. Config endpoint is now async and calls ensure_sleep_mode_disabled()")
|
|
||||||
print(" when TCP is enabled")
|
|
||||||
print("2. Start measurement endpoint calls ensure_sleep_mode_disabled()")
|
|
||||||
print(" before starting the measurement")
|
|
||||||
print("3. Sleep mode check is non-blocking - config/start will succeed")
|
|
||||||
print(" even if the device is unreachable")
|
|
||||||
print("=" * 60)
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main()
|
|
||||||