# Waveform body codec: current working status (2026-05-11)

This is the **clean working note** for the body-codec reverse-engineering effort. It supersedes scattered claims elsewhere when they conflict. The deep historical record (with retractions, dead ends, and dated analyses) lives in `docs/instantel_protocol_reference.md §7.6.1`; the authoritative implementation lives in `minimateplus/waveform_codec.py`.

## TL;DR

The Blastware waveform-file body is a **tagged variable-length block stream**, NOT raw int16 LE samples. Block framing is solved. Tran channel segment-0 decoding is solved (byte-exact vs BW's ASCII export across all 5 high-amplitude fixture events). Multi-segment continuation and the Vert / Long / MicL channel decoders are still open.

**Production code in `minimateplus/client.py:_decode_a5_waveform` still uses the broken legacy int16 LE decoder.** Sample arrays it writes to the `.h5` sidecars are wrong and must be treated as "unverified" by all downstream consumers. The BW binary write path (`blastware_file.py`) is unaffected: it's pure passthrough and remains byte-perfect.

## What's solved

### Block framing

| Tag | Length | Meaning |
|---|---|---|
| `10 NN` | NN/2 + 2 bytes | 4-bit nibble deltas (2 per byte; high nibble first; signed 0..7 / 8..F = -8..-1) |
| `20 NN` | NN + 2 bytes | int8 signed deltas (1 per byte) |
| `00 NN` | 2 bytes | RLE: append NN copies of current value |
| `30 NN` | NN*2 in data section, NN*4 in trailer | Unknown content. Only in loud-from-start events. |
| `40 02` | 20 bytes (fixed) | Segment header |

NN is always a multiple of 4.

Implementation: `walk_body()` in `minimateplus/waveform_codec.py`.
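The framing table can be illustrated with a minimal walker sketch. This is not the repo's `walk_body()`; the names `Block`, `signed_nibble`, and `sketch_walk` are hypothetical and exist only for this example. It assumes the caller passes a `start` offset that points at the first block tag (that is, past the 7-byte preamble), and it stops at `30 NN` rather than guessing whether the lengths listed for that tag include the two tag bytes.

```python
from typing import Iterator, List, NamedTuple


class Block(NamedTuple):
    offset: int        # position of the tag byte in the buffer
    tag: int           # first tag byte: 0x00, 0x10, 0x20 or 0x40
    nn: int            # second tag byte
    deltas: List[int]  # decoded deltas (empty for RLE and segment headers)


def signed_nibble(n: int) -> int:
    """Map 0..7 to 0..7 and 8..F to -8..-1."""
    return n - 16 if n >= 8 else n


def sketch_walk(buf: bytes, start: int = 0) -> Iterator[Block]:
    """Yield blocks until the buffer ends or an undecodable tag appears."""
    pos = start
    while pos + 2 <= len(buf):
        tag, nn = buf[pos], buf[pos + 1]
        deltas: List[int] = []
        if tag == 0x10:
            size = nn // 2 + 2                 # two nibble deltas per byte
            for b in buf[pos + 2 : pos + size]:
                deltas.append(signed_nibble(b >> 4))    # high nibble first
                deltas.append(signed_nibble(b & 0x0F))
        elif tag == 0x20:
            size = nn + 2                      # one int8 delta per byte
            deltas = [b - 256 if b >= 128 else b
                      for b in buf[pos + 2 : pos + size]]
        elif tag == 0x00:
            size = 2                           # RLE: nn copies, no payload
        elif tag == 0x40 and nn == 0x02:
            size = 20                          # fixed-size segment header
        else:
            break                              # 30 NN or unknown: stop here
        if pos + size > len(buf):
            break                              # truncated block
        yield Block(pos, tag, nn, deltas)
        pos += size


demo = bytes([0x10, 0x04, 0x1F, 0x80,               # nibble block, NN=4
              0x20, 0x04, 0x01, 0xFF, 0x7F, 0x80,   # int8 block, NN=4
              0x00, 0x08,                           # RLE, NN=8
              0x30, 0x04])                          # walker stops here
for block in sketch_walk(demo):
    print(hex(block.tag), block.nn, block.deltas)
# 0x10 4 [1, -1, -8, 0]
# 0x20 4 [1, -1, 127, -128]
# 0x0 8 []
```

Yielding one record per block keeps the contiguity check simple: each block's `offset` plus its size should equal the next block's `offset`.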
### 7-byte preamble

```
body[0:3]  = 00 02 00      magic
body[3:5]  = Tran[0]       int16 BE in 16-count units (LSB = 0.005 in/s)
body[5:7]  = Tran[1]       int16 BE in 16-count units
```

### Tran channel, segment 0

Segment 0 (everything before the first `40 02`) encodes Tran samples only. Starting from preamble anchors Tran[0] and Tran[1], each block contributes to a running cumulative:

- `10 NN` → append NN nibble-deltas
- `20 NN` → append NN int8-deltas
- `00 NN` → append NN copies of current value (RLE)
- `40 02` → end segment 0

Verified byte-exact:

| Event | Description | Segment 0 size | Match |
|---|---|---|---|
| `M529LL1A.SP0` | Loud, 0.25 s pretrig | 510 | 510/510 ✓ |
| `M529LL1A.SV0` | Loud from sample 0 | 58 | 58/58 ✓ (stops at first `30 NN`) |
| `M529LL1A.SS0` | Loud from sample 0 | 42 | 42/42 ✓ (stops at first `30 04`) |
| `M529LL1L.JQ0` | Vert-heavy | 510 | 510/510 ✓ |
| `M529LL1L.V70` | Mic-heavy (140 dB) | 510 | 510/510 ✓ |

Implementation: `decode_tran_initial()`.

### Segment header (`40 02`, 20 bytes total)

| Payload offset | Field | Status |
|---|---|---|
| [0:2] | T_delta at first sample of new segment (int16 BE) | ✅ confirmed |
| [2:4] | Likely T_delta at sample seg_start+1 | 🟡 likely |
| [4:6] | Unknown (possibly checksum) | ❓ open |
| [6:8] | Byte length to next segment header − 2 (uint16 BE) | ✅ confirmed |
| [8:12] | Monotonic uint32 LE counter (starts ~0x47) | ✅ confirmed |
| [12:14] | Constant `02 00` | ✅ confirmed |
| [14:18] | Unknown 4-byte field | ❓ open |

## What's still open

1. **Multi-segment Tran continuation.** After segment 0, applying segment 1's blocks as Tran continuation diverges from truth by sample ~512. Block structure is identical to segment 0 and the per-segment delta budget matches the segment size, but the per-sample trajectory is wrong.
2. **Vert / Long / MicL channel decoders.** No verified decoder for any non-Tran channel.
3. **`30 NN` block content.** Only appears in loud-from-start events.
   Probably a channel-switch or alternative-encoding marker for high-amplitude regions. Walker steps over it without decoding.

## Strongest unverified hypothesis

Segments rotate channels:

```
segment 0 → Tran samples 0..509
segment 1 → Vert samples 0..507
segment 2 → Long samples 0..507
segment 3 → Mic  samples 0..507
segment 4 → Tran samples 510..N  (continuation)
...
```

This would explain:

- Why segment-0 = Tran works perfectly.
- Why segment 1 has the same block structure but applying it as Tran continuation gives wrong values.
- Why the per-segment delta budget matches the segment size for a *single* channel (508 deltas per segment, not 4 × 508).

Not yet verified because the per-channel anchor at segment-start isn't identified in the segment header. Bytes [4:6] and [14:18] of the header are the prime candidates.

## Next experiment: segment-channel scoring analyzer

Don't try to hero-code the full decoder. Instead, build a small analysis tool that:

1. For each segment in every fixture event, runs the segment-0 Tran decoder (block-walk + RLE) and produces a cumulative trajectory of 508 deltas.
2. Scores that trajectory against the BW ASCII truth for *each* of {Tran, Vert, Long, MicL} over the segment's sample range, starting from different anchor-byte candidates from the segment header.
3. Reports which (channel, anchor-bytes-location) combination produces the lowest error for each segment.

If the rotation hypothesis is right, segment 0 should clearly score best against Tran, segment 1 against Vert, etc. The winning anchor-bytes-location will reveal which segment-header bytes encode the per-segment channel anchors.

If the rotation hypothesis is *not* right, the scorer will at least narrow down what segment 1 actually carries.

## Test fixtures

Committed under `tests/fixtures/`:

- `decode-re-5-8-26/event-a..event-d/`: original quiet bundle (4 events, PPV < 1 in/s).
  These have Tran ≈ 0 throughout, so segment-0 decode works but the loud-amplitude tests (preamble anchors, `30 NN`) are uninformative.
- `5-11-26/M529LL1A.{SP0,SS0,SV0}`: loud bundle (PPV 6-7 in/s on all channels). These cracked the Tran codec.
- `5-11-26/M529LL1L.{JQ0,V70}`: targeted captures. JQ0 is Vert-heavy, V70 is Mic-heavy (140 dB). These cracked the `00 NN` RLE rule.

Each fixture has a `.TXT` Blastware ASCII export as ground truth.

## Tests

`tests/test_waveform_codec.py` (40 tests, all passing) locks in:

- Block framing (5 tag types with correct lengths).
- Walker contiguity (no gaps or overlaps).
- Segment header parsing (counter monotonicity, fixed-pattern check).
- `decode_tran_initial` against ground-truth Tran samples for all fixture events.

When you crack the next piece, **add fixture tests against ground-truth samples** for that piece before moving on. Don't let unverified code ship without a regression lock-in.
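
A regression lock-in of this kind can be sketched as a small comparison helper that reports a match count in the same `510/510` style used in the verification table, plus the first sample index where decoded output diverges from truth. The name `match_report` is hypothetical and not part of the existing test suite. Loading truth samples from the `.TXT` export is left out because its layout is not described in this note. The comparison runs over the overlapping prefix only, on the assumption that a partial decode (such as segment 0) is shorter than the full truth array.

```python
from typing import Optional, Sequence, Tuple


def match_report(
    decoded: Sequence[int], truth: Sequence[int]
) -> Tuple[int, int, Optional[int]]:
    """Return (matches, compared, first_divergence_index_or_None)."""
    compared = min(len(decoded), len(truth))
    mismatches = [i for i in range(compared) if decoded[i] != truth[i]]
    first = mismatches[0] if mismatches else None
    return compared - len(mismatches), compared, first


# Toy data standing in for a decoded segment and its ground truth.
matches, compared, first = match_report([1, 2, 3, 4], [1, 2, 9, 4, 5])
print(f"{matches}/{compared}, first divergence at {first}")
# 3/4, first divergence at 2
```

In a fixture test, asserting `first is None` with the divergence index in the failure message shows where a new decoder goes off the rails, not just that it failed.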