Files
terra-view/REPORT_PIPELINE_BRIEF.md
T
serversdown ed195ed96b feat(reports): FTP night-report pipeline foundation
Terra-View side of the daily night-vs-baseline sound report for the John Myler
24/7 job. Engine is built and verified end-to-end against real meter data;
SMTP send + scheduler/capture wiring still pending.

- ingest: refactor upload_nrl_data into a callable ingest_nrl_zip(location_id,
  zip_bytes, db) sharing one core with the HTTP endpoint. Capture the .rnh
  percentile map + weightings into session metadata; dedup on store-name +
  start time. Ingest stays metric-agnostic (every Leq column preserved).
- report_pipeline.py: metric registry, Evening/Nighttime windows, correct
  aggregation (Lmax=max, Ln=arithmetic, Leq=logarithmic), baseline = typical
  night, per-location + per-project builders.
- report_renderers.py: HTML email-body renderer (Last/Base/delta layout).
- report_email.py: config-driven SMTP via stdlib (env vars) with a dry-run
  fallback so the pipeline runs without credentials.
- report_orchestrator.py: compute -> render -> always write report.html +
  report.json to disk -> best-effort email.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 20:41:05 +00:00

60 lines
3.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# FTP Report Pipeline — session brief
**Branch:** `feat/ftp-report-pipeline` (off `dev`), worktree `/home/serversdown/terra-view-reports`.
**Scope:** Terra-View only. Do NOT touch SLMM — the SLMM alert/monitor work is live in a
parallel session on `slmm` branch `feat/drd-fix`. Pull device data through the **existing**
SLMM FTP proxy endpoints; add no SLMM code (for v1).
See memory note `client_sound_monitoring_job_2026-07` for the client requirements + timeline.
## Goal
Automated **daily morning report** for the John Myler 3-location sound job: each AM, last
night's noise levels vs the **baseline week**, per location. Data pulled from the meters via
FTP (the meter records 24/7 to SD regardless of TCP wedges). Alerts are a *separate* workstream
(SLMM, real-time DOD) — not in scope here.
## The big realization (why this is small)
The hard parts already exist:
- **SLMM (use as-is, via the `/api/slmm/...` proxy):**
- `GET /api/slmm/{unit}/ftp/files?path=/NL-43` → list files/folders
- `POST /api/slmm/{unit}/ftp/download-folder` → returns the `Auto_####` folder as a **ZIP**
- **Terra-View ingest (reuse):** `backend/routers/project_locations.py:1743` `upload_nrl_data`
already accepts a **ZIP**, extracts, keeps `.rnh` + `_Leq_ .rnd` (drops `_Lp_`/junk via
`_is_wanted`), runs `_parse_rnh` (line 1687) → creates `MonitoringSession` + `DataFile`.
- **Report generator (reuse, source-agnostic):** `backend/routers/projects.py`. The `.rnd`
file reads funnel through 3 helpers — `_peek_rnd_headers` (~135), `_is_leq_file` (~147),
`_read_rnd_file_rows` (~256). `.rnd` files live on disk under `data/{file_path}` (DataFile
holds the path, not a BLOB). The stats/Excel/formatting logic doesn't care where bytes come from.
## Build (Terra-View)
1. **Refactor** `upload_nrl_data`'s core into a callable `ingest_nrl_zip(location_id, zip_bytes, db)`
so it can be invoked programmatically (not only via HTTP UploadFile).
2. **Scheduled pull job** (reuse the existing scheduler): per project location/unit →
`GET /ftp/files` to find new `Auto_####` folders → `POST /ftp/download-folder` (zip) →
`ingest_nrl_zip(...)`. **Dedup** so repeated pulls don't duplicate sessions/files
(track ingested folder names per location).
3. **Baseline aggregation:** aggregate the baseline-week `_Leq_` intervals per location →
reference values (nighttime Leq, L90 floor, typical Lmax).
4. **Nightly report + email:** compute last night's metrics per location, compare to baseline
(deltas), render (reuse the Excel/report machinery), email each morning.
## Data-location decision (light version, agreed)
Keep `MonitoringSession`/`DataFile` **metadata in TV** for now; reuse the existing on-disk file
store. Optional refinement (later): have SLMM keep the pulled files and TV read them through a
SLMM file-serve endpoint (avoids the copy-into-TV step). Don't do that refinement under the
deadline unless trivial — the report logic is identical either way.
## Open questions to resolve early
1. **What's actually in a `_Leq_ .rnd`** — Leq only, or Leq + Lmax + Ln per 15-min interval?
Decides whether the night-vs-baseline report can show L90/Lmax or just Leq. Inspect a real file.
2. **Session rollover / dedup** — does a 2-week run write one growing `Auto_####` folder or new
folders? Drives the "what's new" logic.
3. **`download-folder` over a multi-day run** — confirm it zips cleanly (size/time).
## Client params (confirm with Dave before locking)
Threshold/metric + their "night" window; report recipients + format (email body vs PDF/Excel).
## Timeline
Setup ~7/17/2 (baseline week), shutdown week through ~7/17. Reports needed by ~7/8 (before
shutdown). Today is ~3 weeks out — reliability > features.