feat(reports): FTP night-report pipeline foundation

Terra-View side of the daily night-vs-baseline sound report for the John Myler
24/7 job. Engine is built and verified end-to-end against real meter data;
SMTP send + scheduler/capture wiring still pending.

- ingest: refactor upload_nrl_data into a callable ingest_nrl_zip(location_id,
  zip_bytes, db) sharing one core with the HTTP endpoint. Capture the .rnh
  percentile map + weightings into session metadata; dedup on store-name +
  start time. Ingest stays metric-agnostic (every Leq column preserved).
- report_pipeline.py: metric registry, Evening/Nighttime windows, per-metric
  aggregation (Lmax = max, Ln = arithmetic mean, Leq = logarithmic/energy
  mean), baseline = typical night, per-location + per-project builders.
- report_renderers.py: HTML email-body renderer (Last/Base/delta layout).
- report_email.py: config-driven SMTP via stdlib (env vars) with a dry-run
  fallback so the pipeline runs without credentials.
- report_orchestrator.py: compute -> render -> always write report.html +
  report.json to disk -> best-effort email.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 20:41:05 +00:00
parent 38f2c751b8
commit ed195ed96b
6 changed files with 1142 additions and 144 deletions
@@ -0,0 +1,59 @@
# FTP Report Pipeline — session brief
**Branch:** `feat/ftp-report-pipeline` (off `dev`), worktree `/home/serversdown/terra-view-reports`.
**Scope:** Terra-View only. Do NOT touch SLMM — the SLMM alert/monitor work is live in a
parallel session on `slmm` branch `feat/drd-fix`. Pull device data through the **existing**
SLMM FTP proxy endpoints; add no SLMM code (for v1).
See memory note `client_sound_monitoring_job_2026-07` for the client requirements + timeline.
## Goal
Automated **daily morning report** for the John Myler 3-location sound job: each AM, last
night's noise levels vs the **baseline week**, per location. Data pulled from the meters via
FTP (the meter records 24/7 to SD regardless of TCP wedges). Alerts are a *separate* workstream
(SLMM, real-time DOD) — not in scope here.
## The big realization (why this is small)
The hard parts already exist:
- **SLMM (use as-is, via the `/api/slmm/...` proxy):**
  - `GET /api/slmm/{unit}/ftp/files?path=/NL-43` → list files/folders
  - `POST /api/slmm/{unit}/ftp/download-folder` → returns the `Auto_####` folder as a **ZIP**
- **Terra-View ingest (reuse):** `backend/routers/project_locations.py:1743` `upload_nrl_data`
already accepts a **ZIP**, extracts, keeps `.rnh` + `_Leq_ .rnd` (drops `_Lp_`/junk via
`_is_wanted`), runs `_parse_rnh` (line 1687) → creates `MonitoringSession` + `DataFile`.
- **Report generator (reuse, source-agnostic):** `backend/routers/projects.py`. The `.rnd`
file reads funnel through 3 helpers — `_peek_rnd_headers` (~135), `_is_leq_file` (~147),
`_read_rnd_file_rows` (~256). `.rnd` files live on disk under `data/{file_path}` (DataFile
holds the path, not a BLOB). The stats/Excel/formatting logic doesn't care where bytes come from.
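As a rough illustration of the ZIP filtering described above, the sketch below keeps `.rnh` headers and `_Leq_` `.rnd` files and drops everything else. It is not the existing `_is_wanted` code: `is_wanted`, `wanted_members`, and the filename-matching rules are assumptions made for this example.

```python
import io
import zipfile


def is_wanted(name: str) -> bool:
    """Hypothetical stand-in for `_is_wanted` (matching rules are assumed)."""
    base = name.rsplit("/", 1)[-1].lower()
    if base.endswith(".rnh"):
        return True
    # Keep Leq data files only; `_Lp_` files and other junk fall through.
    return base.endswith(".rnd") and "_leq_" in base


def wanted_members(zip_bytes: bytes) -> dict[str, bytes]:
    """Return {member name: raw bytes} for the files the ingest would keep."""
    kept = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            if is_wanted(info.filename):
                kept[info.filename] = zf.read(info)
    return kept
```

Working on raw bytes rather than an upload object is what would let the same filter serve both the HTTP endpoint and a scheduled pull.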
## Build (Terra-View)
1. **Refactor** `upload_nrl_data`'s core into a callable `ingest_nrl_zip(location_id, zip_bytes, db)`
so it can be invoked programmatically (not only via HTTP UploadFile).
2. **Scheduled pull job** (reuse the existing scheduler): per project location/unit →
`GET /ftp/files` to find new `Auto_####` folders → `POST /ftp/download-folder` (zip) →
`ingest_nrl_zip(...)`. **Dedup** so repeated pulls don't duplicate sessions/files
(track ingested folder names per location).
3. **Baseline aggregation:** aggregate the baseline-week `_Leq_` intervals per location →
reference values (nighttime Leq, L90 floor, typical Lmax).
4. **Nightly report + email:** compute last night's metrics per location, compare to baseline
(deltas), render (reuse the Excel/report machinery), email each morning.
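Steps 3 and 4 can be sketched as below. This is a minimal illustration, not the pipeline's code: the row keys `leq`, `lmax`, and `l90` are hypothetical, intervals are assumed to be of equal length, and it assumes the `.rnd` rows actually carry Lmax and L90 (still an open question). Leq is averaged in the energy domain, Lmax takes the maximum, and L90 uses an arithmetic mean.

```python
import math


def leq_energy_mean(levels_db):
    """Logarithmic (energy) mean of equal-length Leq intervals, in dB."""
    energies = [10 ** (level / 10) for level in levels_db]
    return 10 * math.log10(sum(energies) / len(energies))


def summarize(intervals):
    """Aggregate interval rows; the dict keys are assumed for this sketch."""
    return {
        "leq": leq_energy_mean([row["leq"] for row in intervals]),
        "lmax": max(row["lmax"] for row in intervals),
        "l90": sum(row["l90"] for row in intervals) / len(intervals),
    }


def deltas(last_night, baseline):
    """Last night minus baseline, per metric, rounded to 0.1 dB."""
    return {key: round(last_night[key] - baseline[key], 1) for key in baseline}
```

Averaging Leq values arithmetically would understate loud intervals, which is why the energy mean is used for that metric only.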
## Data-location decision (light version, agreed)
Keep `MonitoringSession`/`DataFile` **metadata in TV** for now; reuse the existing on-disk file
store. Optional refinement (later): have SLMM keep the pulled files and TV read them through a
SLMM file-serve endpoint (avoids the copy-into-TV step). Don't do that refinement under the
deadline unless trivial — the report logic is identical either way.
## Open questions to resolve early
1. **What's actually in a `_Leq_ .rnd`** — Leq only, or Leq + Lmax + Ln per 15-min interval?
Decides whether the night-vs-baseline report can show L90/Lmax or just Leq. Inspect a real file.
2. **Session rollover / dedup** — does a 2-week run write one growing `Auto_####` folder or new
folders? Drives the "what's new" logic.
3. **`download-folder` over a multi-day run** — confirm it zips cleanly (size/time).
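If the meter turns out to write new `Auto_####` folders, the "what's new" check could look like the sketch below. The four-digit pattern and the function name are assumptions, and the FTP listing is reduced to a plain list of names because the proxy's response shape is not shown here.

```python
import re

# Assumes `Auto_####` means "Auto_" followed by exactly four digits.
AUTO_FOLDER = re.compile(r"^Auto_\d{4}$")


def new_auto_folders(listed_names, already_ingested):
    """Return Auto_#### folder names from a listing that were not ingested yet."""
    candidates = {name for name in listed_names if AUTO_FOLDER.match(name)}
    return sorted(candidates - set(already_ingested))
```

If a long run instead grows a single folder, folder names alone cannot tell new data from old, and a per-session key such as the store name plus start time would be needed.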
## Client params (confirm with Dave before locking)
Threshold/metric + their "night" window; report recipients + format (email body vs PDF/Excel).
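Whatever "night" window is agreed, it will probably cross midnight, so the membership test needs a wrap-around case. A minimal sketch follows; `in_window` is a hypothetical helper, and the 22:00 to 07:00 values in the usage note are placeholders, not the client's confirmed window.

```python
from datetime import datetime, time


def in_window(ts: datetime, start: time, end: time) -> bool:
    """True if ts falls in [start, end), allowing the window to cross midnight."""
    t = ts.time()
    if start <= end:
        return start <= t < end
    # Window wraps past midnight, e.g. 22:00 to 07:00.
    return t >= start or t < end
```

For example, `in_window(datetime(2026, 7, 1, 23, 0), time(22), time(7))` is true, while the same call at 12:00 is false.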
## Timeline
Setup ~7/17/2 (baseline week), shutdown week through ~7/17. Reports needed by ~7/8 (before
shutdown). Today is ~3 weeks out — reliability > features.