codec: wire decode_waveform_v2 into production; add MicL dB helper

Replaces the broken legacy int16 LE decoder in client.py with the
verified multi-channel codec.  Three changes:

1. blastware_file.extract_body_bytes(a5_frames) — new helper that
   factors out the body-reconstruction logic from write_blastware_file
   so both writers (BW binary) and decoders (sample arrays) can use
   the same canonical bytes.

2. waveform_codec.decode_a5_frames(a5_frames) — production entry point.
   Returns the raw_samples dict consumers expect (Tran/Vert/Long as
   int16 ADC counts; MicL as native ADC counts).  Internally:
     A5 frames → extract_body_bytes → decode_waveform_v2
                → decoded_to_adc_counts (geos ×16; mic pass-through)

3. waveform_codec.mic_count_to_db(count) — MicL ADC → dB(L) per BW's
   display formula:
     dB = sign(count) × (81.94 + 20 × log10(|count|))   for |count| ≥ 1
   Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1).
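
A minimal sketch of that formula, for illustration only; it is not the code in waveform_codec. Returning 0.0 when |count| < 1 is an assumption, since the formula above is only stated for |count| ≥ 1.

```python
import math


def mic_count_to_db(count: float) -> float:
    """Convert a MicL ADC count to dB(L) using the formula quoted above."""
    magnitude = abs(count)
    if magnitude < 1:
        # Assumption: the formula is undefined below 1 count, so return 0.0.
        return 0.0
    db = 81.94 + 20.0 * math.log10(magnitude)
    # Carry the sign of the ADC count onto the dB value, as in the formula.
    return math.copysign(db, count)
```

With this sketch, a count of 813 rounds to 140.14 and a count of 1 gives 81.94, matching the values quoted above.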

client.py:_decode_a5_waveform is reduced to a thin wrapper that calls
decode_a5_frames and populates event.raw_samples.  Original implementation
preserved as _decode_a5_waveform_LEGACY (dead code; reference only).

Also fixed a tail-end bug in decode_waveform_v2 where trailer-section
"40 02" markers (containing ASCII serial bytes, NOT real segment headers)
were being mis-interpreted, producing 2 spurious samples per channel at
the end of each event.  Added bytes [12:14] == "02 00" validation to
reject non-header markers.
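
The extra check can be sketched roughly as follows. The function name is hypothetical, and two things are assumptions: that "40 02" sits at the start of a candidate header, and that the [12:14] offset is measured from that start. The real walker in decode_waveform_v2 may be structured differently.

```python
SEGMENT_MARKER = b"\x40\x02"   # "40 02" marker bytes
HEADER_CHECK = b"\x02\x00"     # expected value of bytes [12:14]


def looks_like_segment_header(body: bytes, pos: int) -> bool:
    """Return True only if a "40 02" marker at pos passes the 02 00 check."""
    if body[pos:pos + 2] != SEGMENT_MARKER:
        return False
    # Trailer markers carry ASCII serial bytes here instead of 02 00.
    # A slice that runs past the end of the buffer is shorter and fails too.
    return body[pos + 12:pos + 14] == HEADER_CHECK
```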

7 new pytest tests cover the new helpers and dB conversion.  Total:
71 passing (up from 64).

Known limitation (carried over from before): on the loudest fixtures
(SP0/SS0/SV0/event-b) the walker still stops mid-event at mid-segment
edge cases that are not yet characterized.  Every sample it reaches
is decoded correctly; it just does not reach all of them.
Loud events still yield 5,000–15,000 byte-exact samples each.
Claude
2026-05-16 00:27:14 +00:00
committed by serversdown
parent 2ff2762eec
commit 85f4bcfe86
6 changed files with 370 additions and 46 deletions
@@ -53,20 +53,32 @@ correct.
## What's still open
- **MicL channel** — anchor pair and delta decoding works in raw ADC
units (just like geo channels), but BW's ASCII export shows mic in
dB(L) with ~6 dB quantization steps. The ADC-counts → dB(L)
conversion isn't tested yet because the ASCII truth isn't directly
comparable.
- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event. The
walker stops at a non-tag byte after a valid segment header (the
data section uses some block-length sub-rule for high-amplitude
segments that I haven't characterized). Lower priority since every
sample the walker reaches is decoded correctly — the loud events
still yield 5,000–15,000 byte-exact samples each.
- **Walker edge cases** — SP0/SS0/SV0 don't walk the full event due to
block-length quirks past the first few segments. Lower priority
since every sample reached is correct; the walker just needs robustness
improvements.
## What's now wired into production (2026-05-11 late)
- **Production code in `minimateplus/client.py:_decode_a5_waveform`** still
uses the broken legacy int16 LE decoder. Wiring `decode_waveform_v2`
into the `.h5` sidecar path is the obvious next follow-up.
- **`client.py:_decode_a5_waveform`** — now uses
`decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
`event.raw_samples` is populated with int16 ADC counts that flow
through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
reference but is not called.
- **MicL → dB(L) conversion** — exposed as
`waveform_codec.mic_count_to_db(count)`. Verified against BW
display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
the V70 mic-heavy fixture exactly).
- **`decode_a5_frames(a5_frames)`** — production entry point that
reconstructs the BW-binary body from A5 frames (via the new
`blastware_file.extract_body_bytes` helper) and runs the verified
codec. Returns the same `raw_samples` dict shape the consumers
already expect.
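
The commit message describes the final scaling step as "geos ×16; mic pass-through". A minimal sketch of that step is below. The function name and the input shape (a dict of channel name to a list of decoded integers) are assumptions made for illustration, not the actual `decoded_to_adc_counts` signature.

```python
GEO_CHANNELS = ("Tran", "Vert", "Long")


def to_adc_counts_sketch(decoded: dict[str, list[int]]) -> dict[str, list[int]]:
    """Scale geophone channels by 16 and pass every other channel through."""
    raw_samples = {}
    for name, samples in decoded.items():
        if name in GEO_CHANNELS:
            # Geo channels: decoded value x16 gives ADC counts.
            raw_samples[name] = [s * 16 for s in samples]
        else:
            # MicL is already in native ADC counts.
            raw_samples[name] = list(samples)
    return raw_samples
```

The result keeps the same channel keys as the input, which is the property the `raw_samples` consumers rely on.
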
## What's solved