codec: wire decode_waveform_v2 into production; add MicL dB helper

Replaces the broken legacy int16 LE decoder in client.py with the
verified multi-channel codec.

Three changes:

1. blastware_file.extract_body_bytes(a5_frames): new helper that
   factors the body-reconstruction logic out of write_blastware_file,
   so that both writers (BW binary) and decoders (sample arrays) work
   from the same canonical bytes.

2. waveform_codec.decode_a5_frames(a5_frames): production entry point.
   Returns the raw_samples dict consumers expect (Tran/Vert/Long as
   int16 ADC counts; MicL as native ADC counts). Internally:
     A5 frames → extract_body_bytes → decode_waveform_v2
       → decoded_to_adc_counts (geos ×16; mic pass-through)

3. waveform_codec.mic_count_to_db(count): MicL ADC → dB(L) per BW's
   display formula:
     dB = sign(count) × (81.94 + 20 × log10(|count|))   for |count| ≥ 1
   Verified against the V70 fixture: count=813 → 140.14 dB
   (BW PSPL 140.1).

client.py:_decode_a5_waveform is reduced to a thin wrapper that calls
decode_a5_frames and populates event.raw_samples. The original
implementation is preserved as _decode_a5_waveform_LEGACY (dead code;
reference only).

Also fixed a tail-end bug in decode_waveform_v2: trailer-section
"40 02" markers (which contain ASCII serial bytes and are NOT real
segment headers) were being misinterpreted, producing 2 spurious
samples per channel at the end of each event. Added a
bytes [12:14] == "02 00" validation to reject non-header markers.

7 new pytest tests cover the new helpers and the dB conversion.
Total: 71 passing (up from 64).

Known limitation (carried over from before): the walker still stops
mid-event on the loudest fixtures (SP0/SS0/SV0/event-b) at some
mid-segment edge cases not yet characterized. Every sample reached is
decoded correctly; the walker just doesn't reach all of them. Loud
events still yield 5,000–15,000 byte-exact samples each.
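
For reference, the display formula quoted for mic_count_to_db can be sketched as below. This is an illustration of the formula only, not the code in waveform_codec: the name mic_count_to_db_sketch is hypothetical, and the return value for count = 0 is an assumption, since the message defines the formula only for |count| ≥ 1.

```python
import math


def mic_count_to_db_sketch(count: int) -> float:
    """Convert a MicL ADC count to dB(L) using the formula from the message.

    dB = sign(count) * (81.94 + 20 * log10(|count|))   for |count| >= 1
    """
    magnitude = abs(count)
    if magnitude < 1:
        # Assumption: the source does not say what count == 0 maps to.
        # 0.0 is a placeholder chosen for this sketch only.
        return 0.0
    db = 81.94 + 20.0 * math.log10(magnitude)
    # Carry the sign of the ADC count onto the dB magnitude.
    return math.copysign(db, count)


# The two reference points quoted in this commit.
print(round(mic_count_to_db_sketch(1), 2))    # 81.94
print(round(mic_count_to_db_sketch(813), 2))  # 140.14
```

The 81.94 offset is simply the value at count = 1, where the logarithm term vanishes; every factor of 10 in the count adds 20 dB.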
2026-05-16 00:27:14 +00:00
parent 2ff2762eec
commit 85f4bcfe86
6 changed files with 370 additions and 46 deletions
@@ -53,20 +53,32 @@ correct.

 ## What's still open

- **MicL channel** — anchor pair and delta decoding works in raw ADC
-  units (just like geo channels), but BW's ASCII export shows mic in
-  dB(L) with ~6 dB quantization steps.  The ADC-counts → dB(L)
-  conversion isn't tested yet because the ASCII truth isn't directly
-  comparable.
+- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event.  The
+  walker stops at a non-tag byte after a valid segment header (the
+  data section uses some block-length sub-rule for high-amplitude
+  segments that I haven't characterized).  Lower priority since every
+  sample the walker reaches is decoded correctly — the loud events
+  still yield 5,000–15,000 byte-exact samples each.

- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event due to
-  block-length quirks past the first few segments.  Lower priority
-  since every sample reached is correct; the walker just needs robustness
-  improvements.
+## What's now wired into production (2026-05-11 late)

- **Production code in `minimateplus/client.py:_decode_a5_waveform`** still
-  uses the broken legacy int16 LE decoder.  Wiring `decode_waveform_v2`
-  into the `.h5` sidecar path is the obvious next follow-up.
+- **`client.py:_decode_a5_waveform`** — now uses
+  `decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
+  `event.raw_samples` is populated with int16 ADC counts that flow
+  through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
+  Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
+  reference but is not called.
+
+- **MicL → dB(L) conversion** — exposed as
+  `waveform_codec.mic_count_to_db(count)`.  Verified against BW
+  display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
+  the V70 mic-heavy fixture exactly).
+
+- **`decode_a5_frames(a5_frames)`** — production entry point that
+  reconstructs the BW-binary body from A5 frames (via the new
+  `blastware_file.extract_body_bytes` helper) and runs the verified
+  codec.  Returns the same `raw_samples` dict shape the consumers
+  already expect.

 ## What's solved