Commit Graph

2 Commits

Author SHA1 Message Date
serversdown 25386cab8b fix(backfill): regenerate IDFH .h5 + merge binary mic_pspl_psi onto bridge
Two gaps in backfill_thor_events.py that left old Thor events showing
stale charts after a v0.21.1 backfill pass:
1. IDFH events were skipped from .h5 regeneration (the "have decoded
   samples" gate was IDFW-only).  Histograms kept their pre-v0.21.1
   .h5 — written from raw_samples = None, which the renderer turned
   into a near-empty bar chart, or for older events the dB(L)-as-pseudo-
   psi mic scale that produced "107.7 psi" peaks (atomic-bomb level
   instead of footstep level).  Fix: synthesise the same 1-sample-per-
   interval array save_imported_idf v0.21.1 uses (peak ADC count per
   channel per interval) so the renderer's bar-chart grouping has
   data to work with.
2. The IDFW h5 path didn't merge binary_peaks.mic_pspl_psi onto the
   IdfEvent before to_minimateplus_event().  The live save_imported_idf
   does this merge — without it, IdfEvent.from_report() only sees the
   .txt's dB(L) value, the bridge falls back to the dBL→psi formula
   (instead of the binary-accurate 2.14e-6 psi/count value), and the
   h5 writer's per-count mic factor lands on a less-correct value.
   Fix: same merge the live ingest does (lift res.event.peaks.mic_pspl_psi
   onto idf_event.peaks before the bridge call).
Verified against UM6047_20250804190047.IDFH (250-interval prod
histogram): 250 intervals decode, mic_pspl_psi = 2.78e-5 (was being
treated as dB(L)=107.7 in the old h5).
Operator: re-run after deploy.  `docker compose exec sfm python
scripts/backfill_thor_events.py` is idempotent — the existing version
check still skips events already at the new TOOL_VERSION, and review
state + captured_at are preserved on the second pass.
2026-06-01 20:02:54 +00:00
serversdown 43f440812a scripts: add backfill_thor_events.py
Refreshes the bw_report sidecar block + .h5 waveform files for Thor
events ingested before the v0.21.0 adapter wiring + the bee1185 codec
fix.  Those events landed with extensions.idf_report only (no
bw_report, no .h5 for IDFW) — symptom on the UI side: the modal chart
404'd on /waveform.json and the PDF rendered from DB-only fields
without sensor self-check, full per-channel breakdown, or mic dB(L).

Walks <store>/<serial>/<filename>:
  - Reads the existing sidecar (preserves review state + captured_at)
  - Re-runs read_idf_file() on the binary bytes (passes data=
    kwarg so codec doesn't try the broken bare-path Path.read_bytes)
  - Reads extensions.idf_report from the existing sidecar
  - Runs build_bw_report_from_idf adapter
  - Writes refreshed sidecar with bw_report + bumped tool_version,
    preserving review block and original captured_at
  - For IDFW: regenerates .h5 by bridging IdfEvent.from_report ->
    to_minimateplus_event -> write_event_hdf5 (mirrors save_imported_idf
    steps 4-7)
  - IDFH events skip .h5 (histograms have no per-sample data)

Skips events already at current TOOL_VERSION with bw_report present.
--force overrides.  --skip-hdf5 limits to sidecar-only refresh.
--dry-run for preview.

Validated against the prod-snap waveform store: 3,815 Thor sidecars
refreshed cleanly with 0 errors, 462 IDFW .h5 files written, 2 skipped
(binaries with no sidecar — backfill doesn't conjure events from
nothing).  Verified one originally-broken IDFW event now serves
waveform.json (200, 168KB) and a fully populated PDF (119KB vs the
previous 56KB sparse output).

Operator workflow on prod:
  docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py --dry-run
  # Inspect counts, then for real:
  docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py

Idempotent — re-running it is a no-op once everything's at the current
TOOL_VERSION.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 04:37:43 +00:00