minimateplus: wire read_blastware_file to verified body codec #24
Reference in New Issue
Block a user
Delete Branch "feat/wire-codec-to-import-path"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
`read_blastware_file()` was still calling `_decode_samples_4ch_int16_le` (the retracted int16-LE-interleaved hypothesis) on the body bytes, producing ±32K noise on every channel of every BW file read from disk. This was the path watcher-forwarded events take into the system (via the import endpoint → save_imported_bw → read_blastware_file, since the watcher doesn't ship A5 frames), so every .h5 sidecar generated for a forwarded event has been wrong since the feature shipped. The fix is mechanical: pass the body bytes straight to `waveform_codec.decode_waveform_v2()` and run the result through `decoded_to_adc_counts()` for the 16x geo scaling. The body already starts with the codec's exact 7-byte preamble `00 02 00 [Tran[0] BE] [Tran[1] BE]` — confirmed by `body[:3].hex()` across all 9 fixture events. No body-slice adjustment needed. If the codec returns None (truncated/malformed file, synthetic test input with no real waveform), fall back to empty channels with a log warning. The rest of the event (timestamp, waveform_key, project strings, sensor_location, peaks-from-samples=0) is still recoverable. Verified against the bundled fixture corpus: V70 Tran/Vert/Long 3328/3328 sample-sets match .TXT ground truth within the 0.005 in/s display quantum, every row 6S0/RG0/AB0/470 (5-8-26) 3328/2304/1280/1280 samples; Vert PPVs match BW's own report within 0.02 in/s JQ0 3328 samples, Vert PPV 3.384 vs BW 3.465 SP0/SS0/SV0 (loud events) 3072–3328 samples; known walker tail-truncation 1–7 samples per channel, samples reached are byte-exact Existing `test_read_blastware_file_round_trip` (synthetic empty event) continues to pass thanks to the None-fallback. Codec verify scripts (`analysis/verify_quiet_bundle.py`, `analysis/verify_full_decode.py`) re-run unchanged. Added two regression-lock tests in tests/test_event_file_io.py: - test_read_blastware_file_decodes_via_codec[6 fixtures] — verifies sample count + Vert PPV per fixture - test_read_blastware_file_v70_samples_match_txt_truth — verifies every one of V70's 3328 sample-sets across Tran/Vert/Long matches the .TXT ground truth row-by-row within 0.003 in/s Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>Two coupled changes that close the rollout gap left by the read_blastware_file codec wiring: 1. minimateplus/event_file_io.py: bump TOOL_VERSION from 0.16.1 to 0.20.0. This is the version stamp the backfill script reads from each sidecar's source.tool_version field to detect "this sidecar was written before the current decoder shipped, regenerate it." Bumping past every value baked into existing prod sidecars flags them all as stale on the next backfill run — which is exactly what we want, since every pre-codec-wiring sidecar was written by the retracted int16-LE decoder. 2. scripts/backfill_sidecars.py: when the sidecar is being regenerated this iteration (sha mismatch, tool_version too old, or --force), also regenerate the .h5. Previously the .h5 logic only rewrote when --force was passed or the file was missing — so a tool_version-driven sidecar regen left the broken .h5 in place forever. Added a `sidecar_stale` boolean to track the "we're rewriting the sidecar this iteration" state and wired it into the h5 need-rewrite check. Path coverage (verified by trace): - sidecar missing → both regen - --force → both regen - sha mismatch → both regen - tool_ver too old → both regen (THE post-codec-wiring case) - everything OK → skip iteration entirely (h5 untouched) Operator review state (review.false_trigger, reviewer, notes) and the sidecar's extensions block are preserved across regen by the existing read-existing-sidecar / pass-into-event_to_sidecar_dict path — unchanged from prior behavior. Deploy procedure (on prod): 1. Pull this change + the read_blastware_file codec wiring. 2. `python scripts/backfill_sidecars.py --dry-run` to preview. Every sidecar with source.tool_version<0.20.0 will show as "would (re)write". 3. Run for real (drop --dry-run). Expect every pre-fix event to regen. Big stores may take a while. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>