terra-view/REPORT_PIPELINE_BRIEF.md
serversdown ed195ed96b feat(reports): FTP night-report pipeline foundation
Terra-View side of the daily night-vs-baseline sound report for the John Myler
24/7 job. Engine is built and verified end-to-end against real meter data;
SMTP send + scheduler/capture wiring still pending.

- ingest: refactor upload_nrl_data into a callable ingest_nrl_zip(location_id,
  zip_bytes, db) sharing one core with the HTTP endpoint. Capture the .rnh
  percentile map + weightings into session metadata; dedup on store-name +
  start time. Ingest stays metric-agnostic (every Leq column preserved).
- report_pipeline.py: metric registry, Evening/Nighttime windows, correct
  aggregation (Lmax=max, Ln=arithmetic, Leq=logarithmic), baseline = typical
  night, per-location + per-project builders.
- report_renderers.py: HTML email-body renderer (Last/Base/delta layout).
- report_email.py: config-driven SMTP via stdlib (env vars) with a dry-run
  fallback so the pipeline runs without credentials.
- report_orchestrator.py: compute -> render -> always write report.html +
  report.json to disk -> best-effort email.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 20:41:05 +00:00


FTP Report Pipeline — session brief

Branch: feat/ftp-report-pipeline (off dev), worktree /home/serversdown/terra-view-reports. Scope: Terra-View only. Do NOT touch SLMM — the SLMM alert/monitor work is live in a parallel session on slmm branch feat/drd-fix. Pull device data through the existing SLMM FTP proxy endpoints; add no SLMM code (for v1).

See memory note client_sound_monitoring_job_2026-07 for the client requirements + timeline.

Goal

Automated daily morning report for the John Myler 3-location sound job: each AM, last night's noise levels vs the baseline week, per location. Data pulled from the meters via FTP (the meter records 24/7 to SD regardless of TCP wedges). Alerts are a separate workstream (SLMM, real-time DOD) — not in scope here.

The big realization (why this is small)

The hard parts already exist:

  • SLMM (use as-is, via the /api/slmm/... proxy):
    • GET /api/slmm/{unit}/ftp/files?path=/NL-43 → list files/folders
    • POST /api/slmm/{unit}/ftp/download-folder → returns the Auto_#### folder as a ZIP
  • Terra-View ingest (reuse): backend/routers/project_locations.py:1743 upload_nrl_data already accepts a ZIP, extracts, keeps .rnh + _Leq_ .rnd (drops _Lp_/junk via _is_wanted), runs _parse_rnh (line 1687) → creates MonitoringSession + DataFile.
  • Report generator (reuse, source-agnostic): backend/routers/projects.py. The .rnd file reads funnel through 3 helpers — _peek_rnd_headers (~135), _is_leq_file (~147), _read_rnd_file_rows (~256). .rnd files live on disk under data/{file_path} (DataFile holds the path, not a BLOB). The stats/Excel/formatting logic doesn't care where bytes come from.
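A minimal sketch of the "what's new" side of the proxy listing, assuming folder names are exactly `Auto_` plus four digits and that the listing has already been reduced to a plain list of names (the real response shape of `/ftp/files` is not documented here). `list_url`, `new_auto_folders`, and the `base_url` parameter are hypothetical names for illustration, not existing Terra-View code.

```python
import re
from urllib.parse import quote, urlencode

# Assumption: run folders are named Auto_ followed by exactly four digits.
AUTO_FOLDER = re.compile(r"^Auto_\d{4}$")


def list_url(base_url: str, unit: str, path: str = "/NL-43") -> str:
    """Build the URL for GET /api/slmm/{unit}/ftp/files?path=..."""
    query = urlencode({"path": path})
    return f"{base_url}/api/slmm/{quote(unit)}/ftp/files?{query}"


def new_auto_folders(names, already_ingested):
    """Return Auto_#### folder names not yet ingested, in ascending order."""
    seen = set(already_ingested)
    return sorted(n for n in names if AUTO_FOLDER.match(n) and n not in seen)
```

Keeping the selection logic separate from the HTTP call means the dedup rule can be tested without a meter or the proxy being reachable.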

Build (Terra-View)

  1. Refactor upload_nrl_data's core into a callable ingest_nrl_zip(location_id, zip_bytes, db) so it can be invoked programmatically (not only via HTTP UploadFile).
  2. Scheduled pull job (reuse the existing scheduler): per project location/unit → GET /ftp/files to find new Auto_#### folders → POST /ftp/download-folder (zip) → ingest_nrl_zip(...). Dedup so repeated pulls don't duplicate sessions/files (track ingested folder names per location).
  3. Baseline aggregation: aggregate the baseline-week _Leq_ intervals per location → reference values (nighttime Leq, L90 floor, typical Lmax).
  4. Nightly report + email: compute last night's metrics per location, compare to baseline (deltas), render (reuse the Excel/report machinery), email each morning.
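Steps 3 and 4 hinge on aggregating each metric the right way: Leq is averaged in the energy domain, Lmax takes the maximum, and Ln percentiles use a plain arithmetic mean. The sketch below illustrates that rule and the night-vs-baseline delta. The dict keys (`leq`, `lmax`, `l90`) and function names are hypothetical, and the energy average assumes all intervals have equal duration.

```python
import math


def leq_energy_mean(levels_db):
    """Energy (logarithmic) average of equal-length Leq intervals, in dB."""
    if not levels_db:
        raise ValueError("no intervals to aggregate")
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def arithmetic_mean(values):
    """Plain average, used here for Ln percentile levels."""
    if not values:
        raise ValueError("no values to aggregate")
    return sum(values) / len(values)


def summarize(intervals):
    """Collapse interval rows (hypothetical keys) into one value per metric."""
    return {
        "leq": leq_energy_mean([row["leq"] for row in intervals]),
        "lmax": max(row["lmax"] for row in intervals),
        "l90": arithmetic_mean([row["l90"] for row in intervals]),
    }


def deltas(last_night, baseline):
    """Last night minus baseline, per metric, rounded to 0.1 dB."""
    return {key: round(last_night[key] - baseline[key], 1) for key in baseline}
```

For example, two intervals at 50 dB and 60 dB give an energy-averaged Leq of about 57.4 dB, not the 55 dB an arithmetic mean would report.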

Data-location decision (light version, agreed)

Keep MonitoringSession/DataFile metadata in TV for now; reuse the existing on-disk file store. Optional refinement (later): have SLMM keep the pulled files and TV read them through a SLMM file-serve endpoint (avoids the copy-into-TV step). Don't do that refinement under the deadline unless trivial — the report logic is identical either way.

Open questions to resolve early

  1. What's actually in a _Leq_ .rnd — Leq only, or Leq + Lmax + Ln per 15-min interval? Decides whether the night-vs-baseline report can show L90/Lmax or just Leq. Inspect a real file.
  2. Session rollover / dedup — does a 2-week run write one growing Auto_#### folder or new folders? Drives the "what's new" logic.
  3. download-folder over a multi-day run — confirm it zips cleanly (size/time).
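Question 1 can be answered quickly by looking at the header row of a real file. The sketch below assumes the `.rnd` file is comma-separated text with column names on its first line; the column names in the usage note are made up for illustration and should be checked against an actual `_Leq_` file. `peek_columns` is a hypothetical standalone helper, not the existing `_peek_rnd_headers`.

```python
import csv
import io
import re

# Matches percentile-style names such as "LN1(...)" or "L90", not Leq/Lmax.
PERCENTILE = re.compile(r"^l(n\d+|\d+)", re.IGNORECASE)


def peek_columns(text: str) -> dict:
    """Group header columns by metric family: Leq, Lmax, Ln percentiles."""
    header = [col.strip() for col in next(csv.reader(io.StringIO(text)), [])]
    return {
        "leq": [c for c in header if c.lower().startswith("leq")],
        "lmax": [c for c in header if c.lower().startswith("lmax")],
        "ln": [c for c in header if PERCENTILE.match(c)],
    }
```

Called on text whose first line is `Start Time,Leq(Main),Lmax(Main),LN1(Main)`, this returns one column in each family; empty `lmax` or `ln` lists would mean the report can only show Leq.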

Client params (confirm with Dave before locking)

Threshold/metric + their "night" window; report recipients + format (email body vs PDF/Excel).
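Whatever "night" window the client picks will likely cross midnight, so the membership test needs to handle wrap-around. A minimal sketch, with the 22:00 to 07:00 window in the usage note being a placeholder and not a confirmed client parameter; `Window` is a hypothetical name.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Window:
    """Daily time window, start inclusive and end exclusive."""
    start: time
    end: time

    def contains(self, ts: datetime) -> bool:
        t = ts.time()
        if self.start <= self.end:
            # Same-day window, e.g. 19:00 to 22:00.
            return self.start <= t < self.end
        # Window wraps past midnight, e.g. 22:00 to 07:00.
        return t >= self.start or t < self.end
```

With `night = Window(time(22, 0), time(7, 0))`, an interval starting at 23:30 or 06:59 falls inside the window, while one starting at 07:00 or noon does not.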

Timeline

Setup ~7/1-7/2 (baseline week), shutdown week through ~7/17. Reports needed by ~7/8 (before shutdown). Today is ~3 weeks out: reliability > features.