Fix two gaps in backfill_thor_events.py that left old Thor events
showing stale charts after a v0.21.1 backfill pass:
1. IDFH events were skipped from .h5 regeneration because the "have
decoded samples" gate was IDFW-only. Histograms therefore kept their
pre-v0.21.1 .h5. That file was written from raw_samples = None, which
the renderer turned into a near-empty bar chart; for older events it
carried the dB(L)-as-pseudo-psi mic scale that produced "107.7 psi"
peaks (atomic-bomb level instead of footstep level). Fix: synthesise
the same 1-sample-per-interval array that save_imported_idf v0.21.1
uses (peak ADC count per channel per interval) so the renderer's
bar-chart grouping has data to work with.
2. The IDFW h5 path didn't merge binary_peaks.mic_pspl_psi onto the
IdfEvent before to_minimateplus_event(). The live save_imported_idf
does this merge. Without it, IdfEvent.from_report() only sees the
.txt's dB(L) value, the bridge falls back to the dBL→psi formula
(instead of the binary-accurate 2.14e-6 psi/count value), and the
h5 writer's per-count mic factor lands on a less accurate value.
Fix: apply the same merge the live ingest does (lift
res.event.peaks.mic_pspl_psi onto idf_event.peaks before the bridge
call).
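
A minimal sketch of the histogram synthesis described in item 1,
assuming each decoded interval exposes one peak ADC count per channel.
The Interval dataclass, the channel names, and
synthesise_interval_samples are hypothetical stand-ins for
illustration, not the names used in save_imported_idf:

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical channel order; the real channel names may differ.
CHANNELS = ("tran", "vert", "long", "mic")


@dataclass
class Interval:
    """One histogram interval: peak ADC count per channel (assumed shape)."""
    peaks: Dict[str, int]


def synthesise_interval_samples(intervals: List[Interval]) -> Dict[str, List[int]]:
    """Build a 1-sample-per-interval array for each channel.

    Each interval contributes exactly one sample per channel: its peak
    ADC count. A channel missing from an interval is filled with 0.
    """
    samples: Dict[str, List[int]] = {ch: [] for ch in CHANNELS}
    for interval in intervals:
        for ch in CHANNELS:
            samples[ch].append(interval.peaks.get(ch, 0))
    return samples
```

With N decoded intervals, every channel array has length N, which is
what gives a bar-chart renderer one bar per interval.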
Verified against UM6047_20250804190047.IDFH (250-interval prod
histogram): 250 intervals decode, mic_pspl_psi = 2.78e-5 (was being
treated as dB(L)=107.7 in the old h5).
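
The peak merge from item 2 can be sketched as below. Peaks, Event, and
merge_binary_mic_peak are hypothetical names for illustration; only
the mic_pspl_psi field name comes from this change, and the
mic_pspl_dbl field is an assumed placeholder for the .txt-derived
value:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Peaks:
    mic_pspl_psi: Optional[float] = None  # binary-derived mic peak, psi
    mic_pspl_dbl: Optional[float] = None  # assumed field: .txt mic peak, dB(L)


@dataclass
class Event:
    peaks: Peaks = field(default_factory=Peaks)


def merge_binary_mic_peak(idf_event: Event, binary_event: Event) -> Event:
    """Lift the binary-derived mic peak onto the report-derived event.

    If the binary decode produced no psi value, the report-derived event
    is left untouched so downstream code can still use its fallback.
    """
    psi = binary_event.peaks.mic_pspl_psi
    if psi is not None:
        idf_event.peaks.mic_pspl_psi = psi
    return idf_event
```

Calling this before the bridge step means the bridge sees a psi value
and has no reason to fall back to the dB(L) conversion.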
Operator: re-run after deploy. `docker compose exec sfm python
scripts/backfill_thor_events.py` is idempotent — the existing version
check still skips events already at the new TOOL_VERSION, and review
state + captured_at are preserved on the second pass.
Refreshes the bw_report sidecar block + .h5 waveform files for Thor
events ingested before the v0.21.0 adapter wiring + the bee1185 codec
fix. Those events landed with extensions.idf_report only (no
bw_report, no .h5 for IDFW). Symptom on the UI side: the modal chart
404'd on /waveform.json and the PDF rendered from DB-only fields,
without sensor self-check, full per-channel breakdown, or mic dB(L).
Walks <store>/<serial>/<filename>:
- Reads the existing sidecar (preserves review state + captured_at)
- Re-runs read_idf_file() on the binary bytes (passes data=
kwarg so codec doesn't try the broken bare-path Path.read_bytes)
- Reads extensions.idf_report from the existing sidecar
- Runs build_bw_report_from_idf adapter
- Writes refreshed sidecar with bw_report + bumped tool_version,
preserving review block and original captured_at
- For IDFW: regenerates .h5 by bridging IdfEvent.from_report ->
to_minimateplus_event -> write_event_hdf5 (mirrors save_imported_idf
steps 4-7)
- IDFH events skip .h5 (histograms have no per-sample data)
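
The sidecar refresh step above can be sketched as follows, assuming
the sidecar is a JSON-like dict with top-level review, captured_at,
bw_report, and tool_version keys. That layout, refresh_sidecar, and
the TOOL_VERSION placeholder value are assumptions for illustration:

```python
import copy
from typing import Any, Dict

TOOL_VERSION = "0.0.0-example"  # placeholder, not the real version string


def refresh_sidecar(
    existing: Dict[str, Any],
    bw_report: Dict[str, Any],
    tool_version: str = TOOL_VERSION,
) -> Dict[str, Any]:
    """Return a refreshed copy of the sidecar.

    Everything in the existing sidecar (including the review block and
    captured_at) is carried over unchanged; only bw_report and
    tool_version are overwritten. The input dict is not mutated.
    """
    refreshed = copy.deepcopy(existing)
    refreshed["bw_report"] = bw_report
    refreshed["tool_version"] = tool_version
    return refreshed
```

Copying first and overwriting only two keys keeps the preserved fields
safe by construction, rather than relying on a list of keys to retain.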
Skips events already at current TOOL_VERSION with bw_report present.
--force overrides. --skip-hdf5 limits to sidecar-only refresh.
--dry-run for preview.
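
A minimal sketch of the skip rule and the three flags, assuming the
sidecar is a dict with tool_version and bw_report keys. The function
names, help strings, and the TOOL_VERSION placeholder are
hypothetical; only the flag names come from the script:

```python
import argparse
from typing import Any, Dict

TOOL_VERSION = "0.0.0-example"  # placeholder, not the real version string


def build_parser() -> argparse.ArgumentParser:
    """Declare the three flags described above as simple booleans."""
    parser = argparse.ArgumentParser(description="Backfill sketch")
    parser.add_argument("--force", action="store_true",
                        help="reprocess events even if already current")
    parser.add_argument("--skip-hdf5", action="store_true",
                        help="refresh sidecars only, write no .h5")
    parser.add_argument("--dry-run", action="store_true",
                        help="report what would change, write nothing")
    return parser


def should_skip(sidecar: Dict[str, Any], force: bool) -> bool:
    """Skip only when the event is at TOOL_VERSION and has a bw_report."""
    if force:
        return False
    at_current = sidecar.get("tool_version") == TOOL_VERSION
    has_report = sidecar.get("bw_report") is not None
    return at_current and has_report
```

Because both conditions must hold, an event at the current version
with no bw_report is still reprocessed.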
Validated against the prod-snap waveform store: 3,815 Thor sidecars
refreshed cleanly with 0 errors, 462 IDFW .h5 files written, 2 skipped
(binaries with no sidecar — backfill doesn't conjure events from
nothing). Verified one originally-broken IDFW event now serves
waveform.json (200, 168KB) and a fully populated PDF (119KB vs the
previous 56KB sparse output).
Operator workflow on prod:
    docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py --dry-run
    # Inspect counts, then for real:
    docker exec <sfm-container> python3 /app/scripts/backfill_thor_events.py
Idempotent — re-running it is a no-op once everything's at the current
TOOL_VERSION.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>