codec: wire decode_waveform_v2 into production; add MicL dB helper

Replaces the broken legacy int16 LE decoder in client.py with the
verified multi-channel codec. Three changes:
1. blastware_file.extract_body_bytes(a5_frames): new helper that
   factors out the body-reconstruction logic from write_blastware_file
   so that both writers (BW binary) and decoders (sample arrays) can
   use the same canonical bytes.
2. waveform_codec.decode_a5_frames(a5_frames): production entry point.
   Returns the raw_samples dict consumers expect (Tran/Vert/Long as
   int16 ADC counts; MicL as native ADC counts). Internally:
     A5 frames → extract_body_bytes → decode_waveform_v2
               → decoded_to_adc_counts (geos ×16; mic pass-through)
3. waveform_codec.mic_count_to_db(count): MicL ADC → dB(L) per BW's
   display formula:
     dB = sign(count) × (81.94 + 20 × log10(|count|))  for |count| ≥ 1
   Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1).
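For reference, the display formula in item 3 can be sketched in a few lines of Python. This is an illustration only, not the code in waveform_codec: the constant name is made up here, and the behavior for |count| < 1 (returning 0.0) is an assumption, since the formula is only stated for |count| ≥ 1.

```python
import math

# Offset in dB(L) at |count| == 1, taken from the formula above.
# The constant name is hypothetical.
MIC_DB_OFFSET = 81.94


def mic_count_to_db(count: int) -> float:
    """Convert a MicL ADC count to signed dB(L) using the stated formula."""
    magnitude = abs(count)
    if magnitude < 1:
        # Assumption of this sketch: the formula is undefined below 1,
        # so report 0.0 instead of raising or returning -inf.
        return 0.0
    db = MIC_DB_OFFSET + 20.0 * math.log10(magnitude)
    # sign(count) x magnitude-in-dB, as in the formula.
    return math.copysign(db, count)


if __name__ == "__main__":
    print(round(mic_count_to_db(813), 2))  # the V70 check quoted above
```

With count=813 this reproduces the 140.14 dB figure quoted above, and count=1 gives the 81.94 dB offset itself.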
client.py:_decode_a5_waveform is reduced to a thin wrapper that calls
decode_a5_frames and populates event.raw_samples. Original implementation
preserved as _decode_a5_waveform_LEGACY (dead code; reference only).
Also fixed a tail-end bug in decode_waveform_v2 where trailer-section
"40 02" markers (containing ASCII serial bytes, NOT real segment headers)
were being misinterpreted, producing 2 spurious samples per channel at
the end of each event. Added bytes [12:14] == "02 00" validation to
reject non-header markers.
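The header check described above might look roughly like the following. This is a hedged sketch, not the code in decode_waveform_v2: the function and constant names are hypothetical, and it assumes the [12:14] offsets are measured from the first byte of the "40 02" marker.

```python
# Byte patterns quoted in the commit message.
SEGMENT_MARKER = bytes.fromhex("4002")
HEADER_CHECK = bytes.fromhex("0200")


def is_segment_header(buf: bytes, pos: int) -> bool:
    """Return True only if buf[pos:] looks like a real segment header.

    Assumption: offsets 12..13 are relative to the marker's first byte.
    """
    if buf[pos:pos + 2] != SEGMENT_MARKER:
        return False
    # A short slice (truncated buffer) simply fails the comparison.
    return buf[pos + 12:pos + 14] == HEADER_CHECK


# Made-up example buffers: a header-shaped candidate, and a trailer-style
# marker followed by ASCII text (standing in for serial-number bytes).
real_header = SEGMENT_MARKER + bytes(10) + HEADER_CHECK
trailer_marker = SEGMENT_MARKER + b"SERIAL-00001"
```

A walker would call a check like this on each "40 02" candidate and skip those that fail, so trailer markers no longer start a spurious segment.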
7 new pytest tests cover the new helpers and dB conversion. Total:
71 passing (up from 64).
Known limitation (carried over): the walker still stops mid-event on
the loudest fixtures (SP0/SS0/SV0/event-b) when it hits mid-segment
edge cases that are not yet characterized. Every sample reached
is decoded correctly; the walker just doesn't reach all of them.
Loud events still yield 5,000–15,000 byte-exact samples each.
@@ -53,20 +53,32 @@ correct.

## What's still open

- **MicL channel**: anchor pair and delta decoding works in raw ADC
  units (just like geo channels), but BW's ASCII export shows mic in
  dB(L) with ~6 dB quantization steps. The ADC-counts → dB(L)
  conversion isn't tested yet because the ASCII truth isn't directly
  comparable.
- **Walker edge cases**: SP0/SS0/SV0 don't walk the full event. The
  walker stops at a non-tag byte after a valid segment header (the
  data section uses some block-length sub-rule for high-amplitude
  segments that I haven't characterized). Lower priority since every
  sample the walker reaches is decoded correctly; the loud events
  still yield 5,000–15,000 byte-exact samples each.

- **Walker edge cases**: SP0/SS0/SV0 don't walk the full event due to
  block-length quirks past the first few segments. Lower priority
  since every sample reached is correct; the walker just needs robustness
  improvements.

## What's now wired into production (2026-05-11 late)

- **Production code in `minimateplus/client.py:_decode_a5_waveform`** still
  uses the broken legacy int16 LE decoder. Wiring `decode_waveform_v2`
  into the `.h5` sidecar path is the obvious next follow-up.
- **`client.py:_decode_a5_waveform`**: now uses
  `decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
  `event.raw_samples` is populated with int16 ADC counts that flow
  through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
  Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
  reference but is not called.

- **MicL → dB(L) conversion**: exposed as
  `waveform_codec.mic_count_to_db(count)`. Verified against BW
  display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
  the V70 mic-heavy fixture exactly).

- **`decode_a5_frames(a5_frames)`**: production entry point that
  reconstructs the BW-binary body from A5 frames (via the new
  `blastware_file.extract_body_bytes` helper) and runs the verified
  codec. Returns the same `raw_samples` dict shape the consumers
  already expect.
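
The last stage of that pipeline, which the commit message describes as `decoded_to_adc_counts` (geos ×16; mic pass-through), can be sketched as below. This only illustrates the scaling rule: the dict-of-lists shape and the constant names are assumptions, and the real function may also clamp or range-check values, which this sketch does not.

```python
from typing import Dict, List

# Channel names as listed in the commit message; the ×16 factor is the
# "geos ×16" rule. Constant names are hypothetical.
GEO_CHANNELS = ("Tran", "Vert", "Long")
GEO_SCALE = 16


def decoded_to_adc_counts(decoded: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Scale geophone channels by 16; pass every other channel through."""
    raw_samples: Dict[str, List[int]] = {}
    for name, samples in decoded.items():
        if name in GEO_CHANNELS:
            raw_samples[name] = [s * GEO_SCALE for s in samples]
        else:
            # MicL stays in native ADC counts (copied, not aliased).
            raw_samples[name] = list(samples)
    return raw_samples


example = decoded_to_adc_counts(
    {"Tran": [1, -2], "Vert": [0], "Long": [3], "MicL": [813]}
)
```

Here `example["Tran"]` becomes `[16, -32]` while `example["MicL"]` stays `[813]`, which could then be passed to `mic_count_to_db` for display.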

## What's solved