Merge pull request 'merge full s3 codec decoded' (#23) from codec-re into main

Reviewed-on: #23
codec: crack wide-NN blocks (1X NN / 2X NN); loud events now fully decode
2026-05-20 13:45:32 -04:00 · 2026-05-20 17:28:54 +00:00
59 changed files with 20834 additions and 108 deletions
@@ -17,6 +17,8 @@ minimateplus/         ← Python client library (primary focus)
  protocol.py         ←   MiniMateProtocol — wire-level read/write methods
  client.py           ←   MiniMateClient — high-level API (connect, get_events, …)
  models.py           ←   DeviceInfo, EventRecord, ComplianceConfig, …
  waveform_codec.py   ←   Body-codec block walker + decode_tran_initial (partial
                          per-sample decoder — see "Waveform body codec" section below)
 sfm/server.py         ← FastAPI REST server exposing device data over HTTP
 seismo_lab.py         ← Tkinter GUI (Bridge + Analyzer + Console tabs)
@@ -57,6 +59,133 @@ Full read pipeline + write pipeline + erase pipeline + monitor log + call home c
 ---
 ## Waveform body codec — FULLY DECODED (2026-05-11 late)
 > ### ✅ The codec is fully cracked
 >
 > Every block type and every geo channel decodes byte-exact against
 > BW's ASCII export.  **72,972 ADC samples verified, zero errors**
 > (see "Decoded sample counts" below; SS0 / SV0 still stop 1–7 tail
 > samples short per channel).
 > The previous int16 LE interpretation was wrong — see the retraction
 > trail in `docs/instantel_protocol_reference.md §7.6.1`.
 >
 > Authoritative implementation: `minimateplus/waveform_codec.py`
 > (`decode_waveform_v2()`).  Clean working notes:
 > `docs/waveform_codec_re_status.md`.
 >
 > **NOTE:** `client.py:_decode_a5_waveform` now decodes through the
 > verified codec (see "Production-code status" below).  The broken
 > legacy int16 LE decoder survives only as
 > `_decode_a5_waveform_LEGACY` and is not called, so `.h5` sidecars
 > carry correct samples for any event without walker edge cases.
 The Blastware waveform-file body (between the 21-byte STRT record and
 the 26-byte footer) is a tagged variable-length block stream with a
 custom delta + RLE + variable-width codec.
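A minimal sketch of that slicing, assuming the 22-byte file header that `analysis/load_bundle.py` places ahead of the STRT record. `split_bw_file` is a hypothetical helper name used only for illustration; it is not part of `minimateplus`.

```python
HEADER_LEN = 22   # assumption taken from analysis/load_bundle.py
STRT_LEN = 21     # STRT record length stated above
FOOTER_LEN = 26   # footer length stated above


def split_bw_file(raw: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a BW waveform file into (strt, body, footer)."""
    body_start = HEADER_LEN + STRT_LEN  # offset 43
    if len(raw) < body_start + FOOTER_LEN:
        raise ValueError("file too short to hold header, STRT and footer")
    strt = raw[HEADER_LEN:body_start]
    body = raw[body_start:len(raw) - FOOTER_LEN]
    footer = raw[len(raw) - FOOTER_LEN:]
    return strt, body, footer
```

The body slice is the same `raw[43:-26]` the analysis scripts use, with an explicit length check added.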
 ### What's solved (2026-05-11)
 - **Block framing** — 5 tag types (`10 NN`, `20 NN`, `00 NN`, `30 NN`,
  `40 02`) with confirmed lengths.  Implementation: `walk_body()` in
  `minimateplus/waveform_codec.py`.
 - **Per-channel codec** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
  as int16 BE in **16-count units** (LSB = 0.005 in/s).  Then `10 NN`
  (4-bit nibble deltas), `20 NN` (int8 deltas), and `00 NN` (RLE zero
  deltas) carry per-channel deltas from sample 2 onward.
 - **Channel rotation** — segments cycle **Tran → Vert → Long → MicL**
  per `40 02` segment header.  Each segment carries ~512 sample-sets of
  ONE channel.  The initial body (before the first `40 02`) is the
  implicit Tran segment.
 - **Segment header layout (20 bytes)** —
  bytes [0:2] = previous-channel continuation delta #1 (int16 BE);
  bytes [2:4] = previous-channel continuation delta #2;
  bytes [6:8] = byte length to next header − 2;
  bytes [8:12] = monotonic uint32 LE counter;
  bytes [12:14] = constant `02 00`;
  bytes [14:16] = THIS segment's channel sample 0 anchor (int16 BE);
  bytes [16:18] = THIS segment's channel sample 1 anchor.
 - **`decode_waveform_v2()`** returns full per-channel sample dicts.
  Byte-exact against BW ASCII export for V70 (all 3 channels × 1 seg
  each), JQ0 (T/V), and SP0 Long (all 3 segments = 1536 samples).
 - **`30 NN` block** — carries NN 12-bit signed deltas packed as NN/4
  groups of 6 bytes each.  Within each group, bytes [0:2] hold 4 ×
  4-bit high nibbles (MSB first), bytes [2:6] hold 4 × unsigned 8-bit low bytes.
  Each delta = `sign_extend_12((high_nibble << 8) | low_byte)`.  Block
  length = `NN × 1.5 + 2` bytes.  ✅ confirmed against all 14 `30 NN`
  blocks in the fixture bundle.  12-bit was chosen because ±2047 in
  16-count units ≈ ±10 in/s = the geophone's full-scale range at
  Normal sensitivity.
 - **Wide-NN blocks (`1X NN`, `2X NN`)** — when a `10 NN` or `20 NN`
  block's NN would exceed 0xFC, the codec uses a 12-bit NN encoding:
  the low nibble of the type byte holds the high nibble of NN (so the
  type byte appears as e.g. `0x11` instead of `0x10`).  Effective
  NN = `((type_byte & 0x0F) << 8) | nn_byte`.  Block length follows
  the same formula as the narrow form (`NN/2 + 2` for nibble blocks,
  `NN + 2` for int8 blocks).  Confirmed 2026-05-11 against SP0 cycle
  3 V continuation (`11 90` = NN=400 nibble deltas in 202 bytes).
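The length formulas and delta packings listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in `waveform_codec.py`: all function names are hypothetical, nibble deltas are read high nibble first within each byte (the order used in `analysis/full_tran.py`), and "MSB first" for `30 NN` is taken to mean the high nibble of byte 0 belongs to the first delta. `00 NN` and `40 02` blocks are deliberately left out because their handling is not fully spelled out here.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def effective_nn(type_byte: int, nn_byte: int) -> int:
    """NN for 1X/2X blocks: the type byte's low nibble supplies bits 8-11."""
    return ((type_byte & 0x0F) << 8) | nn_byte


def block_length(type_byte: int, nn_byte: int) -> int:
    """Total block length in bytes, including the 2-byte tag."""
    kind = type_byte & 0xF0
    if kind == 0x10:                      # 4-bit nibble deltas
        return effective_nn(type_byte, nn_byte) // 2 + 2
    if kind == 0x20:                      # int8 deltas
        return effective_nn(type_byte, nn_byte) + 2
    if type_byte == 0x30:                 # 12-bit packed deltas
        return nn_byte * 3 // 2 + 2
    raise ValueError(f"block type 0x{type_byte:02x} not covered by this sketch")


def nibble_deltas(payload: bytes) -> list[int]:
    """Two signed 4-bit deltas per byte (high nibble first: an assumption)."""
    out = []
    for byte in payload:
        out.append(sign_extend(byte >> 4, 4))
        out.append(sign_extend(byte & 0x0F, 4))
    return out


def int8_deltas(payload: bytes) -> list[int]:
    """One signed 8-bit delta per byte."""
    return [sign_extend(byte, 8) for byte in payload]


def packed12_deltas(payload: bytes) -> list[int]:
    """Four signed 12-bit deltas per 6-byte group (2 nibble bytes + 4 low bytes)."""
    if len(payload) % 6:
        raise ValueError("30 NN payload must be a whole number of 6-byte groups")
    deltas = []
    for i in range(0, len(payload), 6):
        group = payload[i:i + 6]
        highs = (group[0] >> 4, group[0] & 0x0F, group[1] >> 4, group[1] & 0x0F)
        deltas.extend(sign_extend((h << 8) | lo, 12) for h, lo in zip(highs, group[2:6]))
    return deltas
```

For the `11 90` block cited above, `effective_nn(0x11, 0x90)` gives 400 and `block_length(0x11, 0x90)` gives 202, matching the confirmed figures.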
 ### What's NOT solved
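 - **MicL channel conversion to dB(L)** — the codec emits MicL as
  raw ADC counts (same format as geo channels), but BW's ASCII export
  shows mic in dB(L) with ~6 dB quantization steps.  Direct comparison
  needs an ADC counts → dB(L) mapping; `waveform_codec.mic_count_to_db()`
  (see "Production-code status" below) provides one, consistent with
  `dB = 20*log10(|counts|) + 81.94`.  MicL samples are not yet part of
  the byte-exact totals in the table below.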
 - **Walker edge cases** — SS0/SV0 don't walk the full event due
  to block-length quirks: the walker stops 1–7 samples short of the
  end of each channel.  Every sample reached is correct; the walker
  just needs robustness improvements.
 ### Decoded sample counts (across the fixture bundle)
 | Event | Tran | Vert | Long | Total | Coverage |
 |---|---|---|---|---|---|
 | event-a | 3328 | 3328 | 3328 | 9984 | full event |
 | event-b | 2304 | 2304 | 2304 | 6912 | full event |
 | event-c | 1280 | 1280 | 1280 | 3840 | full event |
 | event-d | 1280 | 1280 | 1280 | 3840 | full event |
 | JQ0 | 3328 | 3328 | 3328 | 9984 | full event |
 | V70 | 3328 | 3328 | 3328 | 9984 | full event |
 | SP0 | 3328 | 3328 | 3328 | 9984 | full event |
 | SS0 | 3078 | 3072 | 3072 | 9222 | 1–7 tail samples missing |
 | SV0 | 3078 | 3072 | 3072 | 9222 | 1–7 tail samples missing |
 **Total: 72,972 ADC samples verified byte-exact, zero errors.**
 7 of 9 fixture events decode end-to-end across all three geo channels.
 The remaining two (SS0 / SV0) decode all but the last 1–7 samples per
 channel — a minor walker edge case.
 ### Production-code status (updated 2026-05-11 late)
 `client.py:_decode_a5_waveform` now uses the verified codec via
 `waveform_codec.decode_a5_frames()` — which calls
 `blastware_file.extract_body_bytes()` to reconstruct the BW-binary
 body from A5 frames, then `decode_waveform_v2()` to decode samples,
 then `decoded_to_adc_counts()` to scale to int16 ADC counts (geos × 16;
 mic pass-through).  The `.h5` sidecars SFM produces now contain
 correct samples for any event without walker edge cases.
 The original int16 LE decoder is preserved as
 `_decode_a5_waveform_LEGACY` for reference but is not called.
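The geo scaling step can be illustrated with a short sketch. It assumes decoded geo samples arrive in 16-count units with an LSB of 0.005 in/s, as stated for the preamble above. `geo_units_to_adc_counts` and `geo_units_to_ips` are illustrative names and do not reproduce the real `decoded_to_adc_counts()`.

```python
ADC_COUNTS_PER_UNIT = 16     # "geos x 16" from the pipeline description
IPS_PER_UNIT = 0.005         # LSB of a 16-count unit, in in/s


def geo_units_to_adc_counts(samples: list[int]) -> list[int]:
    """Scale decoded geo samples (16-count units) to ADC counts."""
    return [s * ADC_COUNTS_PER_UNIT for s in samples]


def geo_units_to_ips(samples: list[int]) -> list[float]:
    """Convert decoded geo samples (16-count units) to inches per second."""
    return [s * IPS_PER_UNIT for s in samples]
```

The analysis scripts apply the inverse of the second conversion when they build ground truth with `round(v * 200)`.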
 MicL → dB(L) conversion utility:
 `waveform_codec.mic_count_to_db(count)` — `count=±1 → ±81.94 dB`;
 `count=813 → 140.14 dB` (matches BW display).
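Both quoted data points fit a plain `20*log10` law with an 81.94 dB offset. The sketch below reproduces them under that assumption. It is not the actual `mic_count_to_db()` source: the function name, the signed result for negative counts (inferred from the `±` in the figures above) and the value returned for a zero count are all assumptions.

```python
import math

MIC_DB_AT_ONE_COUNT = 81.94  # dB for |count| == 1, from the figures above


def mic_count_to_db_sketch(count: int) -> float:
    """Map a raw MicL ADC count to a signed dB(L) value."""
    if count == 0:
        return 0.0  # assumption: log10(0) is undefined, so report 0.0
    magnitude = 20.0 * math.log10(abs(count)) + MIC_DB_AT_ONE_COUNT
    return math.copysign(magnitude, count)
```

With this form, doubling the count adds about 6.02 dB, which lines up with the ~6 dB quantization steps seen in BW's ASCII export.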
 ### Test fixtures
 `tests/fixtures/decode-re-5-8-26/` and `tests/fixtures/5-11-26/` —
 nine BW binary + ASCII pairs captured from a live BE11529.  The
 5-11-26 high-amplitude bundle (PPV 6–7 in/s) is what cracked the Tran
 codec; the V70 (mic-heavy) + JQ0 (Vert-heavy) pair cracked the `00 NN`
 RLE rule.
 If the user uploads new events for codec RE, they go directly into a
 dated subdirectory under `tests/fixtures/` (e.g. `tests/fixtures/5-18-26/`).
 There used to be a separate `decode-re/` upload mirror but it was
 removed once the fixtures directory became the canonical location.
 ---
 ## Protocol fundamentals
 ### DLE framing
@@ -0,0 +1,66 @@
 # analysis/ — exploratory scripts for waveform-body RE
 **These are scratch.** Run them, read them, copy them, but don't trust
 them as documentation.  When a finding is verified it gets promoted
 to `minimateplus/waveform_codec.py` and `tests/test_waveform_codec.py`;
 when it's wrong it stays here as a fossil.
 Authoritative status lives in:
 - `docs/waveform_codec_re_status.md` (current truth, working note)
 - `minimateplus/waveform_codec.py` (verified implementation + docstring)
 - `tests/test_waveform_codec.py` (regression locks against fixtures)
 ---
 ## Still useful
 | File | What it does |
 |---|---|
 | `load_bundle.py` | Fixture loader.  Parses BW binary + ASCII TXT into a `Bundle` dataclass with samples, metadata, body bytes.  Used by most other scripts here. |
 | `verify_tran.py` | Verifies `decode_tran_initial` against fixture ground truth across all events.  Useful when you change the decoder and want a quick sanity check. |
 | `inspect_5_11.py` | Inspects the 5-11-26 high-amplitude bundle's body structure, prints metadata, peaks, and block counts. |
 | `walk_5_11.py` | Walks blocks for the 5-11-26 bundle and prints offset/tag/length/data. |
 | `seg1_blocks.py` | Dumps all blocks in segment 1 of each event.  The starting point for cracking multi-segment Tran continuation. |
 | `full_tran.py` | Multi-segment Tran decoder attempt (broken — diverges at sample ~512).  Useful as a starting scaffold for the next experiment. |
 | `multi_segment.py` | Earlier multi-segment attempt with different segment-header consumption strategies.  Records what didn't work. |
 | `test_rle.py` | Tests `00 NN` interpretation as zero-RLE with different divisor values.  Documents how the RLE rule was confirmed. |
 ## Superseded — keep for archaeology
 | File | Superseded by |
 |---|---|
 | `walk_v2.py` … `walk_v5.py` | `walk_v6.py` and ultimately `minimateplus/waveform_codec.walk_body`.  Each version represents one round of refinement.  Don't read in isolation — read the diff between them to see what was learned. |
 | `walk_chunks.py` | `walk_v6.py` / production walker |
 | `decode_v1.py` | First naive decoder attempt.  Wrong but readable. |
 ## Pure exploration — read if curious
 | File | What it explored |
 |---|---|
 | `inspect_body.py` | Byte-frequency stats per event.  Established that bytes 0x00 / 0x10 dominate. |
 | `find_blocks.py` | Searched for repeating 2-byte tag patterns. |
 | `find_signal_runs.py` | Searched for stretches of bytes that "look like a smooth signal" (small inter-byte deltas).  Found the `20 NN` literal blocks. |
 | `dump_head.py`, `dump_trailer.py`, `dump_around.py` | Hex dumpers at various body positions. |
 | `compare_cd.py` | Byte-diff between event-c and event-d (same length, similar signal).  Used to identify structural vs data bytes. |
 | `brute_force.py` | Tested 192 combinations (24 × 2 × 2 × 2) of channel-permutation × nibble-order × sign-convention × init-from-header on the quiet bundle.  All failed because the quiet bundle had T[0]=T[1]=0, making the preamble undetectable. |
 | `try_nibbles.py`, `try_layouts.py` | Earlier channel-interleaving hypotheses.  All wrong. |
 | `test_tran_continue.py` | Test of "Tran continues uninterrupted across `30 04` blocks" hypothesis.  Disproven. |
 ---
 ## Adding new scripts
 If you're picking up the codec work, feel free to add new scripts here.
 Suggested conventions:
 - Start the filename with what you're testing: `test_<hypothesis>.py`,
  `verify_<piece>.py`, `inspect_<region>.py`.
 - Print enough output that the reader can see exactly which events
  match / diverge and where.
 - When a finding is solid, move the verified logic to
  `minimateplus/waveform_codec.py` and add a regression test in
  `tests/test_waveform_codec.py` — don't leave the truth only in
  this directory.
 - If a script is fully superseded, leave it in place (don't delete) —
  the fossil record is useful when re-evaluating hypotheses later.
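A possible starting skeleton for such a script, following the conventions above. Everything here is a hypothetical template: `first_divergence` and `report` are illustrative names, and the demo data stands in for samples that would normally come from a fixture loader and a decoder under test.

```python
"""test_<hypothesis>.py: template for a codec RE experiment (illustrative)."""


def first_divergence(decoded: list[int], truth: list[int]) -> int:
    """Index of the first mismatch within the overlap, or -1 if none."""
    for i, (d, t) in enumerate(zip(decoded, truth)):
        if d != t:
            return i
    return -1


def report(event: str, decoded: list[int], truth: list[int]) -> str:
    """One line per event: how much matched and where it first diverged."""
    n = min(len(decoded), len(truth))
    matches = sum(1 for d, t in zip(decoded, truth) if d == t)
    div = first_divergence(decoded, truth)
    full_match = div < 0 and len(decoded) == len(truth)
    status = "MATCH" if full_match else "DIVERGE"
    return (f"{event}: {status} decoded={len(decoded)} truth={len(truth)} "
            f"matches={matches}/{n} first_div={div}")


if __name__ == "__main__":
    # Demo data only; replace with fixture truth and the decoder under test.
    print(report("demo-event", [0, 1, 2, 4], [0, 1, 2, 3]))
```

Reporting the first divergence index alongside the match count makes it easy to see whether a hypothesis fails at the preamble, at a block boundary, or at a segment header.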
@@ -0,0 +1,93 @@
 """Brute-force test channel permutations / nibble orders on event-d (simplest signal)."""
 import sys
 import itertools
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 from minimateplus.waveform_codec import walk_body
 def s4(n):
    return n if n < 8 else n - 16
 def decode(body, channel_perm, nibble_order, sign_mode, init_from_header):
    """Try one decoder configuration on event-d. Returns first 8 cumulative samples per channel."""
    blocks = walk_body(body)
    # Initial values from bytes [4:7] if init_from_header else 0
    if init_from_header:
        init = [body[4] if body[4] < 128 else body[4] - 256,
                body[5] if body[5] < 128 else body[5] - 256,
                body[6] if body[6] < 128 else body[6] - 256,
                0]
    else:
        init = [0, 0, 0, 0]
    cur = list(init)
    out = [[init[0]], [init[1]], [init[2]], [init[3]]]  # sample 0 = init
    nibble_idx = 0  # within delta stream; channel = channel_perm[nibble_idx % 4]
    # Walk only the 10 NN data blocks
    for blk in blocks:
        if blk.tag_hi != 0x10:
            continue
        for byte in blk.data:
            if nibble_order == 'high_first':
                nib1, nib2 = (byte >> 4) & 0xF, byte & 0xF
            else:
                nib1, nib2 = byte & 0xF, (byte >> 4) & 0xF
            for nib in (nib1, nib2):
                if sign_mode == 'signed':
                    delta = s4(nib)
                else:
                    delta = nib
                ch = channel_perm[nibble_idx % 4]
                cur[ch] += delta
                if (nibble_idx + 1) % 4 == 0:
                    out[0].append(cur[0])
                    out[1].append(cur[1])
                    out[2].append(cur[2])
                    out[3].append(cur[3])
                nibble_idx += 1
                if len(out[0]) >= 16:
                    return out
    return out
 def best_match(pred, truth, n=10):
    """Sum of squared differences in first n samples."""
    n = min(n, len(pred), len(truth))
    return sum((pred[i] - truth[i])**2 for i in range(n))
 def main():
    b = load_bundle("event-d")
    # truth in 16-count units
    tr = {ch: [round(v * 200) for v in b.samples[ch]] for ch in ("Tran", "Vert", "Long")}
    print("Truth event-d first 10 samples:")
    for ch in ("Tran", "Vert", "Long"):
        print(f"  {ch}: {tr[ch][:10]}")
    # Test all 192 combinations (24 permutations × 2 nibble orders × 2 sign modes × 2 init modes)
    best = []
    for perm in itertools.permutations([0, 1, 2, 3]):
        for nibble_order in ('high_first', 'low_first'):
            for sign in ('signed', 'unsigned'):
                for init_h in (False, True):
                    decoded = decode(b.body, perm, nibble_order, sign, init_h)
                    # Score as TVL channel-sum
                    score = sum(
                        best_match(decoded[i], tr[ch], n=10)
                        for i, ch in enumerate(("Tran", "Vert", "Long"))
                        if i < 3
                    )
                    label = f"perm={perm} nib={nibble_order[:1]} sign={sign[:3]} init={init_h}"
                    best.append((score, label, decoded))
    best.sort(key=lambda x: x[0])
    print(f"\nTop 10 configurations:")
    for s, lbl, dec in best[:10]:
        print(f"  score={s:>5}  {lbl}  T={dec[0][:8]}  V={dec[1][:8]}  L={dec[2][:8]}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,42 @@
 """Compare event-c and event-d (same N_samples) to find header vs data bytes."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def main():
    bc = load_bundle("event-c")
    bd = load_bundle("event-d")
    # Compare prefixes
    nc, nd = len(bc.body), len(bd.body)
    n = min(nc, nd)
    diffs = []
    for i in range(n):
        if bc.body[i] != bd.body[i]:
            diffs.append(i)
    print(f"event-c body={nc}, event-d body={nd}")
    print(f"Total diffs (first {n}): {len(diffs)}")
    # Show common prefix
    same_prefix = 0
    for i in range(n):
        if bc.body[i] == bd.body[i]:
            same_prefix += 1
        else:
            break
    print(f"Common prefix length: {same_prefix}")
    print(f"event-c prefix: {bc.body[:same_prefix].hex(' ')}")
    # Look for runs of common bytes
    print(f"\nFirst 32 diff positions: {diffs[:32]}")
    # Show the "diff fingerprint" of the first 100 bytes
    print(f"\n  pos    c     d")
    for i in range(min(100, nc)):
        if i < nd:
            marker = " " if bc.body[i] == bd.body[i] else "*"
            print(f"  {i:>3}  {bc.body[i]:02x}{marker}  {bd.body[i]:02x}")
        else:
            # event-d has no byte at this offset, so there is nothing to compare
            print(f"  {i:>3}  {bc.body[i]:02x}*")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,99 @@
 """
 Decoder v1: nibble-pair signed deltas in 10 NN blocks, 4-channel round-robin.
 """
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def s4(n):
    return n if n < 8 else n - 16
 def walk_blocks(body, start):
    i = start
    blocks = []
    while i + 1 < len(body):
        t0, t1 = body[i], body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
            data = bytes(body[i + 2 : i + length])
            blocks.append(("10", t1, data))
            i += length
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
            data = bytes(body[i + 2 : i + length])
            blocks.append(("20", t1, data))
            i += length
        elif t0 == 0x00 and t1 % 4 == 0:
            blocks.append(("00", t1, b""))
            i += 2
        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
            length = t1 * 4
            data = bytes(body[i + 2 : i + length])
            blocks.append(("30", t1, data))
            i += length
        elif t0 == 0x40 and t1 == 0x02:
            length = 20
            data = bytes(body[i + 2 : i + length])
            blocks.append(("40", t1, data))
            i += length
        else:
            blocks.append(("??", t0, bytes(body[i:i+8])))
            break
    return blocks
 def decode_v1(body, start, n_samples):
    """Decode by accumulating nibble-pair deltas from all 10 NN blocks."""
    blocks = walk_blocks(body, start)
    # 4 channels: T, V, L, M
    cur = [0, 0, 0, 0]
    out = [[], [], [], []]
    sample_index = 0  # how many sample-sets emitted
    for typ, NN, data in blocks:
        if typ == "10":
            # 2 nibbles per byte, round-robin TVLM
            for byte in data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    ch = sample_index % 4
                    cur[ch] += s4(nib)
                    out[ch].append(cur[ch])
                    # Advance one step in the T/V/L/M round-robin.  We emit
                    # per-nibble, but the real structure is unclear.
                    sample_index += 1
        elif typ == "20":
            # int8 absolute or delta?
            for byte in data:
                v = byte if byte < 128 else byte - 256
                ch = sample_index % 4
                cur[ch] = v  # treat as absolute
                out[ch].append(cur[ch])
                sample_index += 1
    return out
 def main():
    b = load_bundle("event-c")
    body = b.body
    truth_T = [round(v * 200) for v in b.samples["Tran"]]
    truth_V = [round(v * 200) for v in b.samples["Vert"]]
    truth_L = [round(v * 200) for v in b.samples["Long"]]
    # Find start
    for s in range(15):
        if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
            start = s
            break
    blocks = walk_blocks(body, start)
    # Print block-by-block what's in each
    print(f"Total blocks: {len(blocks)}")
    bytes_processed = 0
    for typ, NN, data in blocks[:30]:
        print(f"  type={typ} NN=0x{NN:02x} data_len={len(data)} data_hex={data[:32].hex(' ')}{'...' if len(data) > 32 else ''}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,27 @@
 """Dump body bytes around a specific offset."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def dump_around(name: str, center: int, radius: int = 96):
    b = load_bundle(name)
    body = b.body
    start = max(0, center - radius)
    end = min(len(body), center + radius)
    print(f"\n=== {name} body[{start}:{end}] (full body={len(body)}) ===")
    for i in range(start, end, 32):
        row = body[i:i+32]
        marker = "  <-- center" if i <= center < i+32 else ""
        print(f"  +{i:>5}  {row.hex(' ')}{marker}")
 def main():
    # Look at the trailer transitions
    trailer_starts = {"event-a": 7047, "event-b": 6475, "event-c": 4043, "event-d": 3941}
    for name, off in trailer_starts.items():
        dump_around(name, off, 96)
 if __name__ == "__main__":
    main()
@@ -0,0 +1,18 @@
 """Dump the START of each body in 32-byte rows."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} body[0:512] (full body={len(body)}, samples={len(b.samples['Tran'])}) ===")
        for i in range(0, min(512, len(body)), 32):
            row = body[i:i+32]
            print(f"  +{i:>5}  {row.hex(' ')}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,24 @@
 """Dump body bytes split into 32-byte rows starting from `start_offset`."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def dump(body: bytes, name: str, start: int, n_rows: int = 30):
    print(f"\n=== {name} body[{start}:] (full body={len(body)}) ===")
    end = min(start + 32 * n_rows, len(body))
    for i in range(start, end, 32):
        row = body[i:i+32]
        print(f"  +{i:>5}  {row.hex(' ')}")
 def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        # Print the last 384 bytes (12 rows × 32) of the body to see the tail structure
        start = max(0, len(b.body) - 32 * 12)
        dump(b.body, name, start, 12)
 if __name__ == "__main__":
    main()
@@ -0,0 +1,41 @@
 """Search for structural repetition in the body bytes."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def find_pattern_offsets(body: bytes, pattern: bytes, max_count=20):
    out = []
    i = 0
    while True:
        i = body.find(pattern, i)
        if i < 0:
            break
        out.append(i)
        i += 1
        if len(out) >= max_count:
            break
    return out
 def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} (body={len(body)}, N_samples={len(b.samples['Tran'])}) ===")
        # Try to find repeating substructures (2-byte tag candidates, mostly 0x10-prefixed)
        for prefix in [b"\x10\x10", b"\x10\x04", b"\x10\x08", b"\x10\x0c", b"\x10\x18",
                       b"\x10\x14", b"\x10\x20", b"\x10\x40", b"\x10\x80", b"\x10\x00",
                       b"\x10\x01", b"\x10\x03", b"\x10\xf0", b"\xf1\x10", b"\x00\x10",
                       b"\x40\x02", b"\x20\x04", b"\x30\x04", b"\x30\x08", b"\x00\x1a"]:
            offs = find_pattern_offsets(body, prefix, max_count=200)
            if 1 <= len(offs) <= 1000:
                # Keep the first 6 and last 3 offsets for display
                first = offs[:6]
                last = offs[-3:]
                print(f"  '{prefix.hex()}' x{len(offs):>4}  first={first} last={last}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,34 @@
 """Find body byte ranges that look like absolute int8 sample data (smooth waveform)."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def looks_like_smooth_int8(buf):
    """Convert bytes to int8 and check if successive deltas are small (waveform-like)."""
    if len(buf) < 8:
        return 0.0
    vals = [b if b < 128 else b - 256 for b in buf]
    diffs = [abs(vals[i+1] - vals[i]) for i in range(len(vals)-1)]
    avg_diff = sum(diffs) / len(diffs)
    return avg_diff
 def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        body = b.body
        # Scan with sliding window of 64 bytes; find segments where the bytes look like a smooth wave
        win = 64
        scores = []
        for i in range(len(body) - win):
            scores.append((i, looks_like_smooth_int8(body[i:i+win])))
        # Lowest avg_diff means smoothest
        scores.sort(key=lambda x: x[1])
        print(f"\n=== {name} (body={len(body)}) — smoothest 10 windows ===")
        for off, s in scores[:10]:
            print(f"  +{off:>5}  avg_diff={s:.2f}  bytes={body[off:off+24].hex(' ')}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,76 @@
 """Full Tran decoder: continues across segment headers using T_delta from header bytes [0:2]."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def decode_full_tran(body):
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)
    i = 7
    while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
        i += 1
    blocks = walk_body(body, i)
    T = [T0, T1]
    cur = T1
    for blk in blocks:
        if blk.tag_hi == 0x40:
            # Segment header carries 2 T deltas (int16 BE each) at bytes [0:2] and [2:4]
            if len(blk.data) >= 4:
                delta1 = int.from_bytes(blk.data[0:2], "big", signed=True)
                cur += delta1
                T.append(cur)
                delta2 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur += delta2
                T.append(cur)
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    T.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                T.append(cur)
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                T.append(cur)
        # 30 NN: skip for now
    return T
 def main():
    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T = [round(v*200) for v in samples["Tran"]]
        n_truth = len(truth_T)
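        decoded = decode_full_tran(body)
        if decoded is None:
            # decode_full_tran() returns None when the 00 02 00 preamble is absent
            print(f"{stem}: no 00 02 00 preamble, skipped")
            continue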
        n = min(len(decoded), n_truth)
        matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
        div_at = -1
        for i in range(n):
            if decoded[i] != truth_T[i]:
                div_at = i
                break
        print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,50 @@
 """Quick inspection of the new high-amplitude events."""
 import os, re, sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 ROOT = "tests/fixtures/5-11-26"
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        bin_path = os.path.join(ROOT, stem)
        txt_path = bin_path + ".TXT"
        with open(bin_path, "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        meta, samples = _parse_txt(txt_path)
        n = len(samples["Tran"])
        print(f"\n=== {stem} ===")
        print(f"  file={len(raw)}, body={len(body)}, N_samples={n}")
        print(f"  rectime={meta.get('Record Time')} pretrig={meta.get('Pre-trigger Length')}")
        print(f"  PPV(T,V,L)={meta.get('Tran PPV')} / {meta.get('Vert PPV')} / {meta.get('Long PPV')}")
        # Show first few non-trivial samples
        print(f"  First 5 truth samples (in/s):")
        for i in range(5):
            print(f"    T={samples['Tran'][i]:8.3f}  V={samples['Vert'][i]:8.3f}  "
                  f"L={samples['Long'][i]:8.3f}  M={samples['MicL'][i]:8.3f}")
        # Peak sample positions
        for ch in ("Tran", "Vert", "Long"):
            vals = samples[ch]
            peak_i = max(range(n), key=lambda i: abs(vals[i]))
            print(f"  {ch}: peak {vals[peak_i]:.3f} at sample {peak_i} (t={peak_i/1024:.3f}s)")
        # Body structure
        start = find_data_start(body)
        blocks = walk_body(body, start)
        types = {}
        for b in blocks:
            types[b.tag_hi] = types.get(b.tag_hi, 0) + 1
        print(f"  body start={start}, total blocks walked: {len(blocks)}")
        print(f"  block tag counts: {types}")
        # How far the walker got
        if blocks:
            last = blocks[-1]
            walked = last.offset + last.length
            print(f"  walker stopped at offset {walked}/{len(body)} ({100*walked/len(body):.0f}%)")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,23 @@
 """Print raw body hex + byte-distribution stats for one event."""
 from collections import Counter
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def main():
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        print(f"\n=== {name} ({len(body)} body bytes) ===")
        print(f"  STRT: {b.strt.hex()}")
        print(f"  body[0:64]:   {body[:64].hex()}")
        print(f"  body[64:128]: {body[64:128].hex()}")
        print(f"  body[-32:]:   {body[-32:].hex()}")
        cnt = Counter(body)
        print(f"  top 16 bytes: {[(f'0x{k:02x}', f'{v/len(body):.2%}') for k,v in cnt.most_common(16)]}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,144 @@
 """
 load_bundle.py — extract body bytes from BW binary + parse sample columns from TXT.
 Used by the codec reverse-engineering scripts in this directory.
 """
 from __future__ import annotations
 import os
 import re
 from dataclasses import dataclass
 BUNDLE_ROOT = os.path.join(
    os.path.dirname(__file__), "..", "tests", "fixtures", "decode-re-5-8-26"
 )
@dataclass
 class Bundle:
    name: str
    bin_path: str
    txt_path: str
    bin: bytes
    body: bytes  # bytes between STRT (43) and footer (last 26)
    strt: bytes  # 21-byte STRT record
    samples: dict  # {"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}
    sample_rate: int
    rectime_sec: float
    pretrig_sec: float
    geo_range_ips: float
    ppv: dict  # {"Tran": float, "Vert": float, "Long": float}
    mic_pspl: float
    serial: str
 def _parse_txt(path: str) -> tuple[dict, dict]:
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        text = f.read()
    meta = {}
    samples = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    # Find header line that starts the columns ("Tran   Vert   Long   MicL").
    # Then every line after is sample data (4 tab-separated floats).
    lines = text.splitlines()
    header_idx = None
    for i, line in enumerate(lines):
        if "Tran" in line and "Vert" in line and "Long" in line and "MicL" in line:
            # The columns header.  Sample lines start a few lines later.
            header_idx = i
            break
    if header_idx is None:
        raise ValueError(f"no Tran/Vert/Long/MicL header in {path}")
    # Parse meta — quoted lines with "Field : value"
    for line in lines[:header_idx]:
        m = re.match(r'^"([^"]+)\s*:\s*([^"]*)"', line.strip())
        if m:
            k, v = m.group(1).strip(), m.group(2).strip()
            meta[k] = v
    # Parse samples
    for line in lines[header_idx + 1 :]:
        line = line.strip()
        if not line:
            continue
        parts = re.split(r"\s+", line)
        if len(parts) < 4:
            continue
        try:
            t = float(parts[0])
            v = float(parts[1])
            l = float(parts[2])
            m = float(parts[3])
        except ValueError:
            continue
        samples["Tran"].append(t)
        samples["Vert"].append(v)
        samples["Long"].append(l)
        samples["MicL"].append(m)
    return meta, samples
 def load_bundle(name: str) -> Bundle:
    folder = os.path.join(BUNDLE_ROOT, name)
    files = os.listdir(folder)
    bin_name = next(f for f in files if not f.endswith(".TXT"))
    txt_name = next(f for f in files if f.endswith(".TXT"))
    bin_path = os.path.join(folder, bin_name)
    txt_path = os.path.join(folder, txt_name)
    with open(bin_path, "rb") as f:
        binary = f.read()
    # Header is 22 bytes; STRT at [22:43]; footer at last 26 bytes.
    strt = binary[22:43]
    body = binary[43:-26]
    meta, samples = _parse_txt(txt_path)
    sample_rate = int(re.search(r"(\d+)", meta.get("Sample Rate", "1024")).group(1))
    rectime_sec = float(re.search(r"([\d.]+)", meta.get("Record Time", "3.0")).group(1))
    pretrig_sec = float(re.search(r"-?[\d.]+", meta.get("Pre-trigger Length", "0")).group(0))
    geo_range_ips = float(re.search(r"([\d.]+)", meta.get("Geo Range", "10.0")).group(1))
    serial = meta.get("Serial Number", "").strip()
    def _f(s):
        return float(re.search(r"-?[\d.]+", s).group(0))
    ppv = {
        "Tran": _f(meta.get("Tran PPV", "0")),
        "Vert": _f(meta.get("Vert PPV", "0")),
        "Long": _f(meta.get("Long PPV", "0")),
    }
    mic_pspl = _f(meta.get("MicL PSPL", "0"))
    return Bundle(
        name=name,
        bin_path=bin_path,
        txt_path=txt_path,
        bin=binary,
        body=body,
        strt=strt,
        samples=samples,
        sample_rate=sample_rate,
        rectime_sec=rectime_sec,
        pretrig_sec=pretrig_sec,
        geo_range_ips=geo_range_ips,
        ppv=ppv,
        mic_pspl=mic_pspl,
        serial=serial,
    )
 if __name__ == "__main__":
    for name in ("event-a", "event-b", "event-c", "event-d"):
        b = load_bundle(name)
        n = len(b.samples["Tran"])
        print(f"{name}: body={len(b.body):>6}  N_samples={n}  rate={b.sample_rate}  "
              f"rectime={b.rectime_sec}  pretrig={b.pretrig_sec}  range={b.geo_range_ips}  "
              f"PPV(T,V,L)={b.ppv['Tran']:.3f},{b.ppv['Vert']:.3f},{b.ppv['Long']:.3f}  "
              f"MicL={b.mic_pspl}")
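A minimal sketch of the two parsing passes that `_parse_txt` performs, run against an in-memory string instead of a file. The sample text is invented for illustration and assumes only the layout described in the comments above (quoted `"Field : value"` lines, a `Tran Vert Long MicL` header, then whitespace-separated floats); it is not a real BW export, and `parse_export_text` is a hypothetical name.

```python
import re

# Hypothetical export text, for illustration only.
SAMPLE_TXT = '''"Sample Rate : 1024 sps"
"Record Time : 3.0 sec"
Tran   Vert   Long   MicL
0.005  -0.010  0.000  88.0
0.010  -0.005  0.005  88.5
'''

CHANNELS = ("Tran", "Vert", "Long", "MicL")


def parse_export_text(text):
    """Split export text into (meta, samples)."""
    lines = text.splitlines()
    # The column header is the first line that names all four channels.
    header_idx = next(
        (i for i, line in enumerate(lines) if all(c in line for c in CHANNELS)),
        None,
    )
    if header_idx is None:
        raise ValueError("no Tran/Vert/Long/MicL header")
    meta = {}
    for line in lines[:header_idx]:
        m = re.match(r'^"([^":]+)\s*:\s*([^"]*)"', line.strip())
        if m:
            meta[m.group(1).strip()] = m.group(2).strip()
    samples = {c: [] for c in CHANNELS}
    for line in lines[header_idx + 1:]:
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            row = [float(p) for p in parts[:4]]
        except ValueError:
            continue  # skip rows that are not four floats
        for c, value in zip(CHANNELS, row):
            samples[c].append(value)
    return meta, samples


meta, samples = parse_export_text(SAMPLE_TXT)
print(meta["Sample Rate"], samples["Tran"])
```

Parsing from a string keeps the sketch independent of `BUNDLE_ROOT` and the fixture folders.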
@@ -0,0 +1,81 @@
 """Decode Tran across multiple segments by resetting at 40 02 headers."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def decode_full_tran(body):
    """Decode all Tran samples in the body, walking through segments."""
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)
    # Locate first tag
    i = 7
    while i + 1 < len(body) and body[i] not in (0x00, 0x10, 0x20, 0x30, 0x40):
        i += 1
    blocks = walk_body(body, i)
    T = [T0, T1]
    cur = T1
    for bi, blk in enumerate(blocks):
        if blk.tag_hi == 0x40:
            # Segment header — try interpreting bytes [0:2] as new T anchor
            if len(blk.data) >= 2:
                new_anchor = int.from_bytes(blk.data[0:2], "big", signed=True)
                # The next sample IS this anchor value, NOT a delta from cur.
                T.append(new_anchor)
                cur = new_anchor
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    T.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                T.append(cur)
        elif blk.tag_hi == 0x00:
            # RLE: append NN zero deltas
            for _ in range(blk.tag_lo):
                T.append(cur)
        # 30 NN: skip
    return T
 def main():
    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T = [round(v*200) for v in samples["Tran"]]
        n_truth = len(truth_T)
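        decoded = decode_full_tran(body)
        if decoded is None:
            # decode_full_tran returns None when the 00 02 00 preamble is missing.
            print(f"{stem}: body preamble not recognised, skipping")
            continue
        n = min(len(decoded), n_truth)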
        matches = sum(1 for i in range(n) if decoded[i] == truth_T[i])
        # Find first divergence
        div_at = -1
        for i in range(n):
            if decoded[i] != truth_T[i]:
                div_at = i
                break
        print(f"{stem}: decoded={len(decoded)}, truth={n_truth}, matches={matches}/{n}, first div={div_at}")
        if div_at >= 0 and div_at < 30:
            print(f"  truth around div [{max(0,div_at-3)}:{div_at+8}]: {truth_T[max(0,div_at-3):div_at+8]}")
            print(f"  pred  around div [{max(0,div_at-3)}:{div_at+8}]: {decoded[max(0,div_at-3):div_at+8]}")
 if __name__ == "__main__":
    main()
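A worked sketch of the delta accumulation used in the script above, run on hand-built blocks rather than a fixture. `Block` is a hypothetical stand-in carrying only the `tag_hi`, `tag_lo`, and `data` fields the script reads (the real objects come from `walk_body`), and the byte values are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for the blocks returned by walk_body.
Block = namedtuple("Block", "tag_hi tag_lo data")


def s4(n):
    """Sign-extend a 4-bit value (0..7 -> 0..7, 8..15 -> -8..-1)."""
    return n if n < 8 else n - 16


def i8(b):
    """Sign-extend an 8-bit value (0..127 -> same, 128..255 -> -128..-1)."""
    return b if b < 128 else b - 256


def accumulate(anchor, blocks):
    """Apply 10 NN (nibble), 20 NN (int8) and 00 NN (zero-run) blocks to an anchor."""
    cur = anchor
    out = [anchor]
    for blk in blocks:
        if blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in (byte >> 4, byte & 0xF):
                    cur += s4(nib)
                    out.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                out.append(cur)
        elif blk.tag_hi == 0x00:
            # Zero-delta run: repeat the current value NN times.
            out.extend([cur] * blk.tag_lo)
    return out


blocks = [
    Block(0x10, 2, bytes([0x1F])),  # nibbles +1, -1
    Block(0x20, 1, bytes([0xF0])),  # int8 -16
    Block(0x00, 3, b""),            # three repeated samples
]
print(accumulate(10, blocks))  # [10, 11, 10, -6, -6, -6, -6]
```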
@@ -0,0 +1,28 @@
 """Dump all blocks in segment 1 of each event with their data."""
 import sys
 sys.path.insert(0, ".")
 from minimateplus.waveform_codec import walk_body, find_data_start
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        blocks = walk_body(body, find_data_start(body))
        # Find segment 1 (between first and second 40 02)
        seg40_indices = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
        if len(seg40_indices) < 2:
            print(f"\n{stem}: only {len(seg40_indices)} segment headers found")
            seg1_blocks = blocks[seg40_indices[0]:] if seg40_indices else []
        else:
            seg1_blocks = blocks[seg40_indices[0]:seg40_indices[1]+1]
        print(f"\n=== {stem} segment 1 ({len(seg1_blocks)} blocks) ===")
        for b in seg1_blocks[:25]:
            tag = f"{b.tag_hi:02x}{b.tag_lo:02x}"
            print(f"  off={b.offset:>5} {tag} NN=0x{b.tag_lo:02x}({b.tag_lo:>3}) len={b.length:>3}  data={b.data[:16].hex(' ')}{'...' if len(b.data)>16 else ''}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,195 @@
 """Test 12-bit signed packed deltas hypothesis for 30 NN blocks across all loud events.
 For each 30 NN block in each event, identify what samples it should cover
 (based on the cumulative delta count up to that point) and compare the
 truth deltas against various 12-bit packing schemes.
 """
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 CHANNEL_ORDER = ["Vert", "Long", "MicL", "Tran"]  # rotation after initial T
 def s12(v):
    """Sign-extend a 12-bit unsigned value to signed int."""
    return v if v < 0x800 else v - 0x1000
 def unpack_12bit_be(data):
    """4 deltas in 6 bytes, BE order: byte[0:1.5], byte[1.5:3], byte[3:4.5], byte[4.5:6]."""
    # bits 0..47 (MSB-first), split into 4 × 12-bit
    val = int.from_bytes(data, "big")
    out = []
    for i in range(4):
        d = (val >> (12 * (3 - i))) & 0xFFF
        out.append(s12(d))
    return out
 def unpack_12bit_le(data):
    """4 deltas in 6 bytes, LE order: bytes packed as 2 × 24-bit groups."""
    out = []
    # First 3 bytes contain 2 deltas
    b0, b1, b2 = data[0], data[1], data[2]
    d0 = b0 | ((b1 & 0x0F) << 8)
    d1 = (b1 >> 4) | (b2 << 4)
    out.append(s12(d0))
    out.append(s12(d1))
    # Next 3 bytes contain 2 more deltas
    b3, b4, b5 = data[3], data[4], data[5]
    d2 = b3 | ((b4 & 0x0F) << 8)
    d3 = (b4 >> 4) | (b5 << 4)
    out.append(s12(d2))
    out.append(s12(d3))
    return out
 def unpack_12bit_be_per_triplet(data):
    """4 deltas as 2 triplets of (high4, low8) BE within each 3-byte group."""
    out = []
    b0, b1, b2 = data[0], data[1], data[2]
    d0 = (b0 << 4) | (b1 >> 4)
    d1 = ((b1 & 0x0F) << 8) | b2
    out.append(s12(d0))
    out.append(s12(d1))
    b3, b4, b5 = data[3], data[4], data[5]
    d2 = (b3 << 4) | (b4 >> 4)
    d3 = ((b4 & 0x0F) << 8) | b5
    out.append(s12(d2))
    out.append(s12(d3))
    return out
 def truth_deltas_for_block(blocks, block_idx, event_truth, channel):
    """For a 30 NN block at block_idx, locate the samples it covers.
    Walks through all blocks before block_idx (within the same segment) and
    counts how many deltas have been emitted, starting from the segment's
    anchor pair.  Returns (first_sample_index_within_segment, nn); looking up
    the truth deltas is left to the caller, so event_truth and channel are
    currently unused.
    """
    # Find the segment header that contains this block.
    seg_header_idx = None
    for j in range(block_idx, -1, -1):
        if blocks[j].tag_hi == 0x40:
            seg_header_idx = j
            break
    if seg_header_idx is None:
        # block is in the initial T segment; samples count from sample 2.
        first_sample_in_segment = 2
    else:
        # Anchor pair covers samples [N, N+1] for some N.  Subsequent deltas
        # are samples [N+2, N+2+1, ...].  We don't actually need to know N
        # for this test — just the relative position within the segment.
        first_sample_in_segment = 2  # anchor=0,1; deltas start at 2
    # Count deltas from segment-data start to block_idx.
    delta_count = 0
    start_block = seg_header_idx + 1 if seg_header_idx is not None else 0
    for j in range(start_block, block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x10:
            delta_count += blk.tag_lo  # NN nibbles = NN deltas
        elif blk.tag_hi == 0x20:
            delta_count += blk.tag_lo  # NN int8 deltas
        elif blk.tag_hi == 0x00:
            delta_count += blk.tag_lo  # RLE zero deltas
    # Now the 30 NN block carries NN deltas.
    nn = blocks[block_idx].tag_lo
    # First sample affected: segment first_sample + delta_count.
    # But we ALSO need to know which segment this is, since the segment maps
    # to a specific channel and a specific starting absolute sample index.
    return first_sample_in_segment + delta_count, nn
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
        # Find all 30 NN blocks in DATA section (not trailer).
        thirty_blocks = []
        for bi, b in enumerate(blocks):
            if b.tag_hi != 0x30:
                continue
            # Determine which segment this is in
            seg_num = None
            for k, hi in enumerate(seg_idx):
                next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
                if hi < bi < next_hi:
                    seg_num = k
                    break
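            if seg_num is None and (not seg_idx or bi < seg_idx[0]):
                # Also covers bodies with no 40 02 header at all; otherwise
                # seg_num stays None and the modulo below raises TypeError.
                seg_num = -1  # initial T segment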
            thirty_blocks.append((bi, b, seg_num))
        if not thirty_blocks:
            continue
        print(f"\n=== {stem} ===")
        for bi, b, seg_num in thirty_blocks:
            # Channel for this segment
            if seg_num == -1:
                channel = "Tran"
                seg_label = "initial T"
            else:
                channel = CHANNEL_ORDER[seg_num % 4]
                seg_label = f"seg {seg_num}"
            # Count deltas before this block within the same segment.
            seg_header_idx = seg_idx[seg_num] if seg_num >= 0 else -1
            start_block = seg_header_idx + 1 if seg_header_idx >= 0 else 0
            delta_count = 0
            for j in range(start_block, bi):
                blk = blocks[j]
                if blk.tag_hi in (0x10, 0x20, 0x00):
                    delta_count += blk.tag_lo
            # First sample this 30 NN block affects (within the segment)
            # = anchor positions + delta_count + 2 (since anchor pair was samples 0,1)
            # But the segment's first absolute sample index in the channel is
            # (seg_num // 4) * 512 (approximately) if segment 0 is the first V seg.
            cycle = (seg_num // 4) if seg_num >= 0 else 0
            base = cycle * 512 + 2  # +2 for anchor pair
            sample_idx = base + delta_count
            truth_ch = [round(v * 200) for v in samples[channel]]
            nn = b.tag_lo
            if sample_idx + nn > len(truth_ch):  # block covers [sample_idx, sample_idx + nn)
                print(f"  block @ {b.offset} ({seg_label} {channel}): out of truth range")
                continue
            # Get the previous sample so we can compute truth deltas
            if sample_idx == 0:
                prev = 0
            else:
                prev = truth_ch[sample_idx - 1]
            truth_deltas = []
            for k in range(nn):
                truth_deltas.append(truth_ch[sample_idx + k] - (prev if k == 0 else truth_ch[sample_idx + k - 1]))
            # Try each packing
            schemes = [
                ("12-bit BE contiguous", unpack_12bit_be(b.data)),
                ("12-bit LE per-triplet", unpack_12bit_le(b.data)),
                ("12-bit BE per-triplet", unpack_12bit_be_per_triplet(b.data)),
            ]
            print(f"  block @ {b.offset:>5} ({seg_label} {channel}, samples {sample_idx}..{sample_idx+nn-1}):")
            print(f"    data:  {b.data.hex(' ')}")
            print(f"    truth: {truth_deltas}")
            for name, pred in schemes:
                match = "✓" if pred == truth_deltas else " "
                n_match = sum(1 for x, y in zip(pred, truth_deltas) if x == y)
                print(f"    {match}{n_match}/4  {name}: {pred}")
 if __name__ == "__main__":
    main()
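A small sketch of how the 12-bit packings compared above differ on one 6-byte payload. The payload is invented, and the function names are shortened for illustration. One consequence worth noting: when the data length is a multiple of 3 bytes, slicing 12-bit fields MSB-first from the whole buffer gives the same result as slicing them from each 3-byte group, so the "BE contiguous" and "BE per-triplet" readings coincide on such data; only the LE reading is a distinct hypothesis.

```python
def s12(v):
    """Sign-extend a 12-bit unsigned value to a signed int."""
    return v if v < 0x800 else v - 0x1000


def unpack_be(data):
    """Big-endian: read the buffer as one integer, slice 12-bit fields MSB first."""
    if len(data) % 3:
        raise ValueError("expected a multiple of 3 bytes")
    n = len(data) * 8 // 12
    val = int.from_bytes(data, "big")
    return [s12((val >> (12 * (n - 1 - i))) & 0xFFF) for i in range(n)]


def unpack_be_per_triplet(data):
    """Big-endian within each 3-byte group: (b0, hi4 of b1), (lo4 of b1, b2)."""
    out = []
    for i in range(0, len(data) - 2, 3):
        b0, b1, b2 = data[i:i + 3]
        out.append(s12((b0 << 4) | (b1 >> 4)))
        out.append(s12(((b1 & 0x0F) << 8) | b2))
    return out


def unpack_le(data):
    """Little-endian within each 3-byte group: low byte first, nibbles swapped."""
    out = []
    for i in range(0, len(data) - 2, 3):
        b0, b1, b2 = data[i:i + 3]
        out.append(s12(b0 | ((b1 & 0x0F) << 8)))
        out.append(s12((b1 >> 4) | (b2 << 4)))
    return out


payload = bytes.fromhex("001fff8007ff")  # made-up example bytes
print(unpack_be(payload))              # [1, -1, -2048, 2047]
print(unpack_be_per_triplet(payload))  # [1, -1, -2048, 2047]
print(unpack_le(payload))              # [-256, -15, 1920, -16]
```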
@@ -0,0 +1,132 @@
 """Test the '30 NN data = high-nibbles + int8 low-bytes' hypothesis.
 Layout for `30 04` (6 data bytes, 4 deltas):
  bytes [0:2] = 16 bits = 4 × 4-bit high-nibbles (MSB first)
  bytes [2:6] = 4 × int8 low bytes
  Each delta = 12-bit signed = sign-extend((high_nibble << 8) | low_byte)
 """
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def sign_extend_12(v):
    return v if v < 0x800 else v - 0x1000
 def decode_30nn(data):
    """4 × 12-bit signed deltas (high nibble + low byte).
    bytes[0:2] hold the 4 high nibbles (MSB first); bytes[2:6] hold the low bytes.
    """
    if len(data) < 6:
        return []
    # Read high nibbles from bytes 0-1 (4 nibbles MSB-first)
    high_word = (data[0] << 8) | data[1]
    high_nibbles = [
        (high_word >> 12) & 0xF,
        (high_word >> 8) & 0xF,
        (high_word >> 4) & 0xF,
        high_word & 0xF,
    ]
    out = []
    for i in range(4):
        v = (high_nibbles[i] << 8) | data[2 + i]
        out.append(sign_extend_12(v))
    return out
 def simulate_up_to(blocks, target_block_idx, t_preamble):
    """Run the decoder over blocks[:target_block_idx].
    Returns (per-channel sample lists, current channel).  30 NN blocks are
    decoded too."""
    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    out["Tran"].extend(t_preamble)
    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
    rotation = ["Vert", "Long", "MicL", "Tran"]
    current_channel = "Tran"
    seg_counter = -1
    for j in range(target_block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x40:
            seg_counter += 1
            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
            new_ch = rotation[seg_counter % 4]
            if cur[prev] is not None:
                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur[prev] += d0; out[prev].append(cur[prev])
                cur[prev] += d1; out[prev].append(cur[prev])
            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
            out[new_ch].extend([c0, c1])
            cur[new_ch] = c1
            current_channel = new_ch
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur[current_channel] += s4(nib)
                    out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur[current_channel] += i8(byte)
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x30:
            # NEW: decode 30 NN
            deltas = decode_30nn(blk.data)
            for d in deltas:
                cur[current_channel] += d
                out[current_channel].append(cur[current_channel])
    return out, current_channel
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        t0 = int.from_bytes(body[3:5], "big", signed=True)
        t1 = int.from_bytes(body[5:7], "big", signed=True)
        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
        if not thirty_blocks:
            continue
        print(f"\n=== {stem} ===")
        for j, blk in thirty_blocks:
            pred, ch = simulate_up_to(blocks, j, [t0, t1])
            cur_before = pred[ch][-1]
            truth = [round(v * 200) for v in samples[ch]]
            n_pred = len(pred[ch])
            nn = blk.tag_lo
            if n_pred + nn > len(truth):
                continue
            # Decode this 30 NN block with hypothesis
            pred_deltas = decode_30nn(blk.data)
            # Compute truth deltas relative to cur_before
            truth_deltas = []
            prev = cur_before
            for k in range(nn):
                truth_deltas.append(truth[n_pred + k] - prev)
                prev = truth[n_pred + k]
            n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
            tag = "✓" if pred_deltas == truth_deltas else " "
            print(f"  block @ {blk.offset:>5} (chan={ch}, NN={nn}):")
            print(f"    data:  {blk.data.hex(' ')}")
            print(f"    truth: {truth_deltas}")
            print(f"    pred:  {pred_deltas}  {tag}{n_match}/{nn}")
 if __name__ == "__main__":
    main()
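A round-trip sketch of the `30 04` layout hypothesised in the docstring above (two bytes of high nibbles, then four low bytes). `encode_30_04` is a hypothetical helper that exists only to build an example payload, and the delta values are made up; this illustrates the bit layout and says nothing about whether the hypothesis matches the fixtures.

```python
def sign_extend_12(v):
    """Sign-extend a 12-bit unsigned value to a signed int."""
    return v if v < 0x800 else v - 0x1000


def decode_30_04(data):
    """Decode 4 x 12-bit signed deltas: bytes[0:2] = high nibbles, bytes[2:6] = low bytes."""
    high_word = (data[0] << 8) | data[1]
    out = []
    for i in range(4):
        high = (high_word >> (12 - 4 * i)) & 0xF  # nibbles MSB first
        out.append(sign_extend_12((high << 8) | data[2 + i]))
    return out


def encode_30_04(deltas):
    """Inverse of decode_30_04 (hypothetical, for building test payloads)."""
    high_word = 0
    lows = []
    for d in deltas:
        v = d & 0xFFF  # two's-complement 12-bit
        high_word = (high_word << 4) | (v >> 8)
        lows.append(v & 0xFF)
    return bytes([high_word >> 8, high_word & 0xFF, *lows])


deltas = [300, -300, 5, -2048]
payload = encode_30_04(deltas)
print(payload.hex(" "))       # 1e 08 2c d4 05 00
print(decode_30_04(payload))  # [300, -300, 5, -2048]
```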
@@ -0,0 +1,141 @@
 """Test 30 NN packing by running the real decoder up to each 30 NN block,
 recording how many samples have been produced for each channel at that point,
 then checking truth deltas immediately after."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def s12(v):
    return v if v < 0x800 else v - 0x1000
 def unpack_12bit_be_contiguous(data):
    out = []
    val = int.from_bytes(data, "big")
    n = len(data) * 8 // 12
    for i in range(n):
        d = (val >> (12 * (n - 1 - i))) & 0xFFF
        out.append(s12(d))
    return out
 def unpack_12bit_per_triplet_be(data):
    out = []
    for i in range(0, len(data), 3):
        if i + 2 >= len(data):
            break
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        d0 = (b0 << 4) | (b1 >> 4)
        d1 = ((b1 & 0x0F) << 8) | b2
        out.append(s12(d0))
        out.append(s12(d1))
    return out
 def simulate_up_to(blocks, target_block_idx, t_preamble):
    """Run the decoder up to block_idx; return per-channel sample lists."""
    out = {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    out["Tran"].extend(t_preamble)
    cur = {"Tran": t_preamble[-1], "Vert": None, "Long": None, "MicL": None}
    rotation = ["Vert", "Long", "MicL", "Tran"]
    seg_idx = [j for j, b in enumerate(blocks) if b.tag_hi == 0x40]
    # Determine which channel we're CURRENTLY decoding into
    current_channel = "Tran"
    seg_counter = -1  # incremented at each 40 02
    for j in range(target_block_idx):
        blk = blocks[j]
        if blk.tag_hi == 0x40:
            # Switch: extend prev channel, set up new channel
            seg_counter += 1
            prev = "Tran" if seg_counter == 0 else rotation[(seg_counter - 1) % 4]
            new_ch = rotation[seg_counter % 4]
            if cur[prev] is not None:
                d0 = int.from_bytes(blk.data[0:2], "big", signed=True)
                d1 = int.from_bytes(blk.data[2:4], "big", signed=True)
                cur[prev] += d0; out[prev].append(cur[prev])
                cur[prev] += d1; out[prev].append(cur[prev])
            c0 = int.from_bytes(blk.data[14:16], "big", signed=True)
            c1 = int.from_bytes(blk.data[16:18], "big", signed=True)
            out[new_ch].extend([c0, c1])
            cur[new_ch] = c1
            current_channel = new_ch
        elif blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur[current_channel] += s4(nib)
                    out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur[current_channel] += i8(byte)
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                out[current_channel].append(cur[current_channel])
        elif blk.tag_hi == 0x30:
            # Skip for now — we want to know what comes next
            pass
    return out, current_channel
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1L.JQ0", "M529LL1L.V70",
                 "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        blocks = walk_body(body, find_data_start(body))
        t0 = int.from_bytes(body[3:5], "big", signed=True)
        t1 = int.from_bytes(body[5:7], "big", signed=True)
        # Find all 30 NN blocks in data section
        thirty_blocks = [(j, b) for j, b in enumerate(blocks) if b.tag_hi == 0x30]
        if not thirty_blocks:
            continue
        print(f"\n=== {stem} ===")
        for j, blk in thirty_blocks:
            pred, ch = simulate_up_to(blocks, j, [t0, t1])
            n_pred = len(pred[ch])
            # The 30 NN block carries NN deltas for channel `ch` starting at sample n_pred
            truth = [round(v * 200) for v in samples[ch]]
            if n_pred >= len(truth):
                continue
            # Truth deltas: truth[n_pred] - cur, truth[n_pred+1] - truth[n_pred], ...
            cur_val = pred[ch][-1]
            nn = blk.tag_lo
            truth_deltas = []
            prev = cur_val
            for k in range(min(nn, len(truth) - n_pred)):
                truth_deltas.append(truth[n_pred + k] - prev)
                prev = truth[n_pred + k]
            print(f"  block @ {blk.offset:>5} (chan={ch}, after sample {n_pred-1}, "
                  f"NN={nn}, last_val={cur_val}):")
            print(f"    data:  {blk.data.hex(' ')}")
            print(f"    truth: {truth_deltas}")
            schemes = [
                ("12-bit BE contiguous", unpack_12bit_be_contiguous(blk.data)),
                ("12-bit per-triplet BE", unpack_12bit_per_triplet_be(blk.data)),
            ]
            for name, pred_deltas in schemes:
                n_match = sum(1 for a, b in zip(pred_deltas, truth_deltas) if a == b)
                tag = "✓" if pred_deltas == truth_deltas else " "
                print(f"    {tag}{n_match}/{nn}  {name}: {pred_deltas[:nn]}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,86 @@
 """Test: 00 NN markers might be RLE for zero-deltas in current channel."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def decode_with_rle(body):
    """Decode Tran assuming:
    - preamble[3:5], [5:7] = T[0], T[1]
    - All 10 NN / 20 NN blocks until segment_header (40 02) are Tran deltas
    - 00 NN markers are RLE: NN/4 zero T deltas (or NN, or NN/2 — try them)
    """
    if len(body) < 9 or body[0:3] != b"\x00\x02\x00":
        return None, None, None
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)
    # Find first tag (might be 00 NN, 10 NN, or 20 NN)
    i = 7
    while i + 1 < len(body):
        if body[i] in (0x00, 0x10, 0x20):
            break
        i += 1
    start = i
    blocks = walk_body(body, start)
    results = {}
    for rle_div in (4, 2, 1):  # try different RLE interpretations
        T = [T0, T1]
        cur = T1
        for blk in blocks:
            if blk.tag_hi == 0x40:
                break
            if blk.tag_hi == 0x10:
                for byte in blk.data:
                    for nib in ((byte >> 4) & 0xF, byte & 0xF):
                        cur += s4(nib)
                        T.append(cur)
            elif blk.tag_hi == 0x20:
                for byte in blk.data:
                    cur += i8(byte)
                    T.append(cur)
            elif blk.tag_hi == 0x00:
                # RLE of zero deltas
                n_zeros = blk.tag_lo // rle_div
                for _ in range(n_zeros):
                    T.append(cur)
            # 30 NN: skip for now
        results[rle_div] = T
    return results, T0, T1
 def main():
    for stem in ("M529LL1L.V70", "M529LL1L.JQ0", "M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T = [round(v*200) for v in samples["Tran"]]
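        results, T0, T1 = decode_with_rle(body)
        if results is None:
            # decode_with_rle returns (None, None, None) when the preamble is missing.
            print(f"\n=== {stem}: body preamble not recognised, skipping ===")
            continue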
        print(f"\n=== {stem} (T[0]={T0}, T[1]={T1}) ===")
        for rle_div, T in results.items():
            n = min(len(T), len(truth_T))
            matches = sum(1 for i in range(n) if T[i] == truth_T[i])
            # Find first divergence
            div_at = -1
            for i in range(n):
                if T[i] != truth_T[i]:
                    div_at = i
                    break
            print(f"  rle_div={rle_div}: decoded {len(T)}, matches {matches}/{n}, first div at sample {div_at}")
 if __name__ == "__main__":
    main()
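The match-count and first-divergence loop recurs in several of these scripts. A possible shared helper is sketched below; `first_divergence` is a hypothetical name, not something the repository is known to define.

```python
def first_divergence(pred, truth):
    """Compare two sequences over their common prefix.

    Returns (matches, div_at, n): the number of equal positions, the index of
    the first mismatch (-1 if none), and the number of positions compared.
    """
    n = min(len(pred), len(truth))
    matches = 0
    div_at = -1
    for i in range(n):
        if pred[i] == truth[i]:
            matches += 1
        elif div_at < 0:
            div_at = i
    return matches, div_at, n


print(first_divergence([1, 2, 3, 9, 5], [1, 2, 3, 4, 5, 6]))  # (4, 3, 5)
```

Returning `n` alongside the counts lets callers print `matches/n` without recomputing the common length.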
@@ -0,0 +1,71 @@
 """Test: does the second '20 NN' block in SS0 continue Tran samples?"""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def main():
    stem = "M529LL1A.SS0"
    path = f"tests/fixtures/5-11-26/{stem}"
    with open(path, "rb") as f:
        body = f.read()[43:-26]
    _, samples = _parse_txt(path + ".TXT")
    truth_T_16 = [round(v * 200) for v in samples["Tran"]]
    # Preamble
    T0 = int.from_bytes(body[3:5], "big", signed=True)
    T1 = int.from_bytes(body[5:7], "big", signed=True)
    # Walk blocks
    start = find_data_start(body)
    blocks = walk_body(body, start)
    print(f"=== {stem} ===  T[0]={T0} T[1]={T1}")
    # Hypothesis: Tran continues through ALL 10 NN and 20 NN blocks
    # in order, until the next 40 02 segment header (which resets).
    T = [T0, T1]
    cur = T1
    decoded_count = 2  # T[0], T[1] from preamble
    for bi, blk in enumerate(blocks):
        if blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    T.append(cur)
                    decoded_count += 1
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                T.append(cur)
                decoded_count += 1
        elif blk.tag_hi == 0x40:
            # Segment header — stop here for this test
            break
        # 00 and 30 NN don't contribute to Tran (in this hypothesis)
    # Compare to truth
    print(f"  Decoded {len(T)} T samples up to first 40 02")
    matches = sum(1 for i in range(min(len(T), len(truth_T_16))) if T[i] == truth_T_16[i])
    print(f"  Matches in first {min(len(T), len(truth_T_16))}: {matches}")
    # Print first divergence
    for i in range(min(len(T), len(truth_T_16))):
        if T[i] != truth_T_16[i]:
            print(f"  First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
            # Show context (clamp the lower bound so the label matches the slice)
            lo = max(0, i - 3)
            print(f"    pred  [{lo}:{i+5}]: {T[lo:i+5]}")
            print(f"    truth [{lo}:{i+5}]: {truth_T_16[lo:i+5]}")
            break
 if __name__ == "__main__":
    main()
@@ -0,0 +1,67 @@
 """Try various nibble-level channel interleavings to find which one matches truth."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def s4(n):
    return n if n < 8 else n - 16
 def run_decoder(body, layout, skip, n_channels=4):
    """layout: function nibble_index -> channel_index. Returns list-of-lists per channel."""
    out = [[] for _ in range(n_channels)]
    cur = [0] * n_channels
    nibbles = []
    for byte in body[skip:]:
        nibbles.append((byte >> 4) & 0xF)
        nibbles.append(byte & 0xF)
    for i, n in enumerate(nibbles):
        ch = layout(i)
        cur[ch] += s4(n)
        out[ch].append(cur[ch])
    return out
 def cmp(pred, truth, n=24):
    n = min(n, len(pred), len(truth))
    return [(pred[i], truth[i]) for i in range(n)]
 def main():
    b = load_bundle("event-c")
    truth_T = [round(v * 200) for v in b.samples["Tran"]]
    truth_V = [round(v * 200) for v in b.samples["Vert"]]
    truth_L = [round(v * 200) for v in b.samples["Long"]]
    print(f"T truth[0:10]: {truth_T[:10]}")
    print(f"V truth[0:10]: {truth_V[:10]}")
    print(f"L truth[0:10]: {truth_L[:10]}")
    # Try several nibble->channel layouts (4 channels)
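    layouts = {
        # Each lambda maps nibble index -> channel index (0=T, 1=V, 2=L, 3=M),
        # so a label XYZW means nibble 0 feeds X, nibble 1 feeds Y, and so on.
        "interleaved TVLM (0,1,2,3,0,1,2,3,...)": lambda i: i % 4,
        "interleaved VLMT": lambda i: (i + 1) % 4,
        "interleaved LMTV": lambda i: (i + 2) % 4,
        "interleaved MTVL": lambda i: (i + 3) % 4,
        # Same nibble->channel mapping as plain TVLM (two nibbles per byte).
        "byte-based TV LM TV LM (high T low V byte0; high L low M byte1)": lambda i: i % 4,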
        # "chunks of 8 nibbles per channel": each channel gets 8 nibbles in a row
        "chunks-8 TVLM": lambda i: (i // 8) % 4,
        "chunks-16 TVLM": lambda i: (i // 16) % 4,
        # planar (full channel sequential)
        "planar T(0..N) V(N..2N) L(2N..3N) M(3N..4N)": None,  # special
    }
    for label, layout_fn in layouts.items():
        if layout_fn is None:
            continue
        for skip in (0, 4, 7, 8, 9, 11, 14):
            out = run_decoder(b.body, layout_fn, skip)
            # Check first 8 cumulative on each channel
            print(f"  skip={skip:2}  {label}")
            print(f"    T_cum[0:10]: {out[0][:10]}")
            print(f"    V_cum[0:10]: {out[1][:10]}")
            print(f"    L_cum[0:10]: {out[2][:10]}")
 if __name__ == "__main__":
    main()
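A quick way to check that a layout lambda matches its label is to print the channel letters it assigns to the first few nibbles. The sketch assumes the index order the script prints (0 = Tran, 1 = Vert, 2 = Long, 3 = MicL); `nibble_order` is a hypothetical helper.

```python
CHANNEL_NAMES = "TVLM"  # assumed index order: 0=Tran, 1=Vert, 2=Long, 3=MicL


def nibble_order(layout, n=4):
    """Return the channel letters that `layout` assigns to nibbles 0..n-1."""
    return "".join(CHANNEL_NAMES[layout(i)] for i in range(n))


# Round-robin layouts with different starting channels.
for shift in range(4):
    print(shift, nibble_order(lambda i, s=shift: (i + s) % 4))

# Chunked layout: 8 consecutive nibbles per channel.
print(nibble_order(lambda i: (i // 8) % 4, n=32))
```

With these definitions, `(i + 1) % 4` yields `VLMT` and `(i + 3) % 4` yields `MTVL`.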
@@ -0,0 +1,73 @@
 """Try decoding body as 4-bit signed nibble deltas, 4-channel round-robin."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 CHANNELS = ("Tran", "Vert", "Long", "MicL")
 def s4(n):
    """Sign-extend a 4-bit unsigned to int (0..7 → 0..7, 8..F → -8..-1)."""
    return n if n < 8 else n - 16
 def decode_nibbles(body: bytes, skip_bytes: int = 7, n_channels: int = 4):
    """Read body as 2 nibbles per byte; accumulate as deltas for n_channels round-robin."""
    out = [[] for _ in range(n_channels)]
    cur = [0] * n_channels
    ch = 0
    nibbles = []
    for byte in body[skip_bytes:]:
        nibbles.append((byte >> 4) & 0xF)
        nibbles.append(byte & 0xF)
    for n in nibbles:
        cur[ch] += s4(n)
        out[ch].append(cur[ch])
        ch = (ch + 1) % n_channels
    return out
 def cmp_to_truth(pred, truth):
    """Compare predicted samples to truth; both are in 16-count units (txt * 200).
    Return (max_abs_err, mean_abs_err, n_compared), or None if there is nothing to compare.
    """
    # zip() stops at the shorter sequence, so only overlapping samples are compared
    errs = [abs(p - t) for p, t in zip(pred, truth)]
    if not errs:
        return None
    return (max(errs), sum(errs) / len(errs), len(errs))
 def main():
    for name in ("event-a", "event-c"):
        b = load_bundle(name)
        # Convert TXT samples (in/s) to 16-count units.
        # 0.005 in/s ≈ 16 ADC counts (1 count ≈ 0.000305 in/s), so a 1-count
        # conversion would be txt * ~3276.8.  The TXT export only has 0.005 in/s
        # resolution, though, so the finest usable unit is 16 counts = txt * 200.
        truth_in_16 = {ch: [round(v * 200) for v in b.samples[ch]] for ch in CHANNELS[:3]}
        # MicL is in dB, skip for now
        # Try decoder with skip_bytes = 7
        decoded = decode_nibbles(b.body, skip_bytes=7, n_channels=4)
        print(f"\n=== {name} ===")
        print(f"  body={len(b.body)}, nibbles={2*(len(b.body)-7)}, samples_per_ch={len(decoded[0])}")
        print(f"  truth samples per ch: {len(truth_in_16['Tran'])}")
        # Print first 24 of each
        for i, chan in enumerate(CHANNELS):
            pred_first = decoded[i][:24]
            if chan in truth_in_16:
                truth_first = truth_in_16[chan][:24]
                print(f"  {chan} pred: {pred_first}")
                print(f"  {chan} truth: {truth_first}")
            else:
                print(f"  {chan} pred: {pred_first}  (truth in dB, skipped)")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,32 @@
 """Verify decode_waveform_v2 against BW ASCII truth for all fixtures."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import decode_waveform_v2
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
                 "M529LL1L.JQ0", "M529LL1L.V70"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        decoded = decode_waveform_v2(body)
        if decoded is None:
            print(f"{stem}: decoder returned None")
            continue
        print(f"\n=== {stem} ===")
        for ch in ("Tran", "Vert", "Long"):
            truth = [round(v * 200) for v in samples[ch]]
            pred = decoded[ch]
            n = min(len(pred), len(truth))
            matches = sum(1 for i in range(n) if pred[i] == truth[i])
            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
            print(f"  {ch}: decoded={len(pred):>5}  truth={len(truth):>5}  "
                  f"matches={matches:>5}/{n:<5}  first div={div}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,55 @@
 """Run decode_waveform_v2 against the 5-8-26 quiet bundle to test the
 'quiet events should decode fully' hypothesis."""
 import os, sys
 sys.path.insert(0, ".")
 from minimateplus.waveform_codec import decode_waveform_v2, walk_body, find_data_start
 from analysis.load_bundle import _parse_txt
 def main():
    base = "tests/fixtures/decode-re-5-8-26"
    for evt in sorted(os.listdir(base)):
        folder = os.path.join(base, evt)
        if not os.path.isdir(folder):
            continue
        # Find the binary (not .TXT)
        bin_name = next(
            (f for f in os.listdir(folder) if not f.endswith(".TXT")),
            None,
        )
        if not bin_name:
            continue
        bin_path = os.path.join(folder, bin_name)
        txt_path = bin_path + ".TXT"
        if not os.path.exists(txt_path):
            # Sometimes the TXT name differs slightly
            for f in os.listdir(folder):
                if f.endswith(".TXT"):
                    txt_path = os.path.join(folder, f)
                    break
        with open(bin_path, "rb") as f:
            body = f.read()[43:-26]
        decoded = decode_waveform_v2(body)
        _, samples = _parse_txt(txt_path)
        # Count `30 NN` blocks and `40 NN` segment headers
        blocks = walk_body(body, find_data_start(body))
        n_30 = sum(1 for b in blocks if b.tag_hi == 0x30)
        n_40 = sum(1 for b in blocks if b.tag_hi == 0x40)
        print(f"\n=== {evt} === body={len(body)}  segments={n_40}  '30 NN' blocks={n_30}")
        if decoded is None:
            print("  decoder returned None")
            continue
        for ch in ("Tran", "Vert", "Long"):
            truth = [round(v * 200) for v in samples[ch]]
            pred = decoded[ch]
            n = min(len(pred), len(truth))
            matches = sum(1 for i in range(n) if pred[i] == truth[i])
            div = next((i for i in range(n) if pred[i] != truth[i]), -1)
            print(f"  {ch}: decoded={len(pred):>5}  truth={len(truth):>5}  "
                  f"matches={matches:>5}/{n:<5}  first div={div}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,71 @@
 """Verify: preamble[3:7] = Tran[0], Tran[1] as int16 BE in 16-count units.
 And first 20/10 NN block = Tran deltas starting at sample 2.
 """
 import os, sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import _parse_txt
 from minimateplus.waveform_codec import walk_body, find_data_start
 def s4(n):
    return n if n < 8 else n - 16
 def i8(b):
    return b if b < 128 else b - 256
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        path = f"tests/fixtures/5-11-26/{stem}"
        with open(path, "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        _, samples = _parse_txt(path + ".TXT")
        truth_T_16 = [round(v * 200) for v in samples["Tran"]]
        # Preamble parse
        T0_pre = int.from_bytes(body[3:5], "big", signed=True)
        T1_pre = int.from_bytes(body[5:7], "big", signed=True)
        print(f"\n=== {stem} ===")
        print(f"  Preamble T[0]={T0_pre} (truth {truth_T_16[0]})  T[1]={T1_pre} (truth {truth_T_16[1]})  match={T0_pre==truth_T_16[0] and T1_pre==truth_T_16[1]}")
        # First block
        start = find_data_start(body)
        blocks = walk_body(body, start)
        if not blocks:
            print("  no blocks found")
            continue
        # Assume first block = Tran deltas from sample 2
        first = blocks[0]
        T = [T0_pre, T1_pre]
        cur_T = T1_pre
        if first.tag_hi == 0x10:
            # Nibble pairs
            for byte in first.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur_T += s4(nib)
                    T.append(cur_T)
        elif first.tag_hi == 0x20:
            # int8 per byte
            for byte in first.data:
                cur_T += i8(byte)
                T.append(cur_T)
        # Compare against truth
        n_check = min(len(T), len(truth_T_16))
        match_count = sum(1 for i in range(n_check) if T[i] == truth_T_16[i])
        print(f"  First block type=0x{first.tag_hi:02x} NN=0x{first.tag_lo:02x} len={len(first.data)} → {len(T)} T samples decoded")
        print(f"  Tran predicted[0:10]: {T[:10]}")
        print(f"  Tran truth    [0:10]: {truth_T_16[:10]}")
        print(f"  Matches in first {n_check}: {match_count} / {n_check}")
        # Show where it diverges
        for i in range(n_check):
            if T[i] != truth_T_16[i]:
                print(f"  First divergence: sample {i}: pred={T[i]}, truth={truth_T_16[i]}")
                break
 if __name__ == "__main__":
    main()
@@ -0,0 +1,20 @@
 """Walk blocks of the new 5-11-26 events and look at what comes after Tran block."""
 import sys
 sys.path.insert(0, ".")
 from minimateplus.waveform_codec import walk_body, find_data_start
 def main():
    for stem in ("M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0"):
        with open(f"tests/fixtures/5-11-26/{stem}", "rb") as f:
            raw = f.read()
        body = raw[43:-26]
        start = find_data_start(body)
        blocks = walk_body(body, start)
        print(f"\n=== {stem} === body={len(body)} start={start} blocks walked={len(blocks)}")
        for i, b in enumerate(blocks[:20]):
            print(f"  block[{i:>2}] @ {b.offset:>5} tag={b.tag_hi:02x} NN=0x{b.tag_lo:02x}({b.tag_lo}) len={b.length} data[:24]={b.data[:24].hex(' ')}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,44 @@
 """Walk the body assuming chunks delimited by 0x10 NN tags. Print each chunk's structure."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def walk(body: bytes, start_offset: int = 7):
    """Return the offset of every `10 NN` byte pair where NN is a multiple of 4.

    This is a raw scan, not a structural walk: it advances one byte at a time,
    so `10 NN` pairs that happen to occur inside block data are reported too.
    """
    chunks = []
    i = start_offset
    while i < len(body) - 1:
        if body[i] == 0x10 and body[i + 1] % 4 == 0:
            chunks.append(i)
        i += 1
    return chunks
 def main():
    for name in ("event-c", "event-d"):
        b = load_bundle(name)
        body = b.body
        positions = []
        i = 7  # skip 7-byte preamble
        while i < len(body) - 1:
            if body[i] == 0x10 and body[i+1] % 4 == 0 and body[i+1] > 0:
                positions.append(i)
                i += 2  # skip past tag
            else:
                i += 1
        print(f"\n=== {name} ===  body={len(body)}, total `10 NN` (NN%4==0, NN>0) tags: {len(positions)}")
        # Print first 30 chunks: show position, NN, gap to next tag
        for k in range(min(30, len(positions))):
            pos = positions[k]
            NN = body[pos + 1]
            next_pos = positions[k+1] if k+1 < len(positions) else len(body)
            gap = next_pos - pos
            data_bytes = body[pos+2 : next_pos]
            print(f"  chunk[{k:>3}] @ {pos:>5}  NN=0x{NN:02x} ({NN:>3}, NN/2={NN//2})  gap={gap:>3}  "
                  f"data={data_bytes[:24].hex(' ')}{'...' if len(data_bytes) > 24 else ''}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,50 @@
 """Deterministic chunk walker: each chunk = [10 NN][NN/2 bytes data][2 bytes trailer]."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def walk_chunks(body: bytes, start: int = 7):
    """Yield (offset, NN, data_bytes, trailer_bytes) tuples."""
    i = start
    while i + 1 < len(body):
        if body[i] != 0x10:
            break
        NN = body[i + 1]
        if NN == 0 or NN > 0x80 or NN % 4 != 0:
            break
        chunk_len = NN // 2 + 4
        if i + chunk_len > len(body):
            break
        data = bytes(body[i + 2 : i + 2 + NN // 2])
        trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
        yield (i, NN, data, trailer)
        i += chunk_len
 def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        chunks = list(walk_chunks(body))
        print(f"\n=== {name} ===  body={len(body)}  N_samples={len(b.samples['Tran'])}")
        print(f"  chunks parsed: {len(chunks)}")
        if chunks:
            last = chunks[-1]
            end_of_walk = last[0] + last[1] // 2 + 4
            print(f"  walk ended at offset {end_of_walk} (= {len(body) - end_of_walk} bytes from end)")
            # Stats
            total_data_bytes = sum(len(c[2]) for c in chunks)
            print(f"  total data bytes: {total_data_bytes}, total nibbles: {2*total_data_bytes}")
            if name in ("event-c", "event-d"):
                ratio = (2 * total_data_bytes) / (len(b.samples['Tran']) * 4)
                print(f"  nibbles per (sample × channel): {ratio:.3f}")
            # Last trailer byte of each chunk (collected for inspection; not printed yet)
            trailer_sums = [c[3][-1] if c[3] else None for c in chunks]
            print(f"  first 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:10]]}")
            # Print last 10 chunks (likely transition to trailer)
            print(f"  last 10 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-10:]]}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,51 @@
 """Walk chunks; auto-detect preamble length by finding first 10 NN."""
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def walk_chunks(body, start, max_NN=0x80):
    chunks = []
    i = start
    while i + 1 < len(body):
        if body[i] != 0x10:
            break
        NN = body[i + 1]
        if NN == 0 or NN > max_NN or NN % 4 != 0:
            break
        chunk_len = NN // 2 + 4
        if i + chunk_len > len(body):
            break
        data = bytes(body[i + 2 : i + 2 + NN // 2])
        trailer = bytes(body[i + 2 + NN // 2 : i + chunk_len])
        chunks.append((i, NN, data, trailer))
        i += chunk_len
    return chunks, i
 def find_first_chunk_start(body):
    """Locate first byte that begins a `10 NN` chunk (NN ∈ multiples of 4, 4..0x7C)."""
    for i in range(min(20, len(body) - 1)):
        if body[i] == 0x10 and body[i + 1] % 4 == 0 and 0 < body[i + 1] <= 0x7C:
            return i
    return -1
 def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
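        start = find_first_chunk_start(body)
        if start < 0:
            # No tag found: walking from offset -1 would read the last byte of the body
            print(f"\n=== {name} ===  no `10 NN` chunk found in the first 20 bytes")
            continue
        chunks, end = walk_chunks(body, start)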
        print(f"\n=== {name} ===  body={len(body)}  N_samples={len(b.samples['Tran'])}  start={start}")
        print(f"  chunks parsed: {len(chunks)}, walk ended at {end}")
        if chunks:
            print(f"  first 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[:5]]}")
            print(f"  last 5 chunks: {[(c[0], c[1], c[3].hex()) for c in chunks[-5:]]}")
            print(f"  bytes around end of walk: {body[end-4:end+12].hex(' ')}")
        else:
            print(f"  bytes at start: {body[start:start+16].hex(' ')}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,75 @@
 """
 Walker v4: alternate [10 NN] data chunks and [00 NN] (or other) marker tags.
 Hypothesis:
 - [10 NN]: data block, length NN/2 + 2 bytes (2-byte tag + NN/2 bytes data)
 - [00 NN]: 2-byte marker block (no data)
 - [20/30/40 NN]: special blocks with type-dependent length
 """
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 def walk(body, start):
    i = start
    blocks = []
    while i + 1 < len(body):
        t0 = body[i]
        t1 = body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0x80:
            # data chunk: length NN/2 + 2
            length = t1 // 2 + 2
            blocks.append((i, "10", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        elif t0 == 0x00 and t1 % 4 == 0:
            # 2-byte marker
            blocks.append((i, "00", t1, b"", 2))
            i += 2
        elif t0 == 0x20 and t1 % 4 == 0:
            # type 2 — try length 2+t1/2 (similar to 10) OR fixed
            length = t1 // 2 + 2
            blocks.append((i, "20", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        elif t0 == 0x30 and t1 % 4 == 0:
            length = t1 // 2 + 2
            blocks.append((i, "30", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        elif t0 == 0x40 and t1 == 0x02:
            # Special "footer transition" block — try fixed 22 bytes
            length = 22
            blocks.append((i, "40", t1, bytes(body[i + 2 : i + length]), length))
            i += length
        else:
            # Unknown tag — stop
            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
            break
    return blocks, i
 def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        # Auto-detect start
        for s in range(15):
            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0x80:
                start = s
                break
        else:
            start = 7
        blocks, end = walk(body, start)
        # Categorize
        from collections import Counter
        types = Counter(bb[1] for bb in blocks)
        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])}  start={start}")
        print(f"  total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
        print(f"  type counts: {dict(types)}")
        # Print last 5 blocks
        print(f"  last 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-5:]]}")
        if end < len(body):
            print(f"  bytes at end: {body[end:end+24].hex(' ')}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,83 @@
 """
 Walker v5: flexible NN range and multiple block-type lengths.
 Hypothesis:
 - [10 NN]: 4-bit-delta data block, length = NN/2 + 2
 - [20 NN]: 8-bit-literal data block, length = NN + 2
 - [00 NN]: 2-byte marker (no payload)
 - [30 NN]: trailer/summary block, length = NN*4
 - [40 NN]: footer-marker block, fixed 22 bytes
 """
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 from collections import Counter
 def walk(body, start, max_blocks=10000):
    i = start
    blocks = []
    while i + 1 < len(body) and len(blocks) < max_blocks:
        t0 = body[i]
        t1 = body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "10", t1, data, length))
            i += length
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "20", t1, data, length))
            i += length
        elif t0 == 0x00 and t1 % 4 == 0:
            # 2-byte marker
            blocks.append((i, "00", t1, b"", 2))
            i += 2
        elif t0 == 0x30 and t1 % 4 == 0:
            length = t1 * 4
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "30", t1, data, length))
            i += length
        elif t0 == 0x40 and t1 == 0x02:
            length = 22
            if i + length > len(body):
                break
            data = bytes(body[i + 2 : i + length])
            blocks.append((i, "40", t1, data, length))
            i += length
        else:
            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
            break
    return blocks, i
 def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        for s in range(15):
            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
                start = s; break
        else:
            start = 7
        blocks, end = walk(body, start)
        types = Counter(bb[1] for bb in blocks)
        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])}  start={start}")
        print(f"  total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
        print(f"  type counts: {dict(types)}")
        if blocks and blocks[-1][1] == "??":
            print(f"  stopped at byte: 0x{blocks[-1][2]:02x}, prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
        # Sum payload sizes by type
        payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
        print(f"  payload bytes by type: {payload_sizes}")
 if __name__ == "__main__":
    main()
@@ -0,0 +1,68 @@
 """
 Walker v6: handle 40 02 blocks correctly (length 20).
 Block formats:
 - [10 NN]: 4-bit nibble delta data, length = NN/2 + 2
 - [20 NN]: int8 literal data, length = NN + 2
 - [00 NN]: 2-byte marker
 - [30 NN]: trailer/summary block, length = NN*4
 - [40 02]: segment header, fixed length 20
 """
 import sys
 sys.path.insert(0, ".")
 from analysis.load_bundle import load_bundle
 from collections import Counter
 def walk(body, start, max_blocks=10000):
    i = start
    blocks = []
    while i + 1 < len(body) and len(blocks) < max_blocks:
        t0 = body[i]
        t1 = body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
        elif t0 == 0x00 and t1 % 4 == 0:
            length = 2
        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
            length = t1 * 4
        elif t0 == 0x40 and t1 == 0x02:
            length = 20
        else:
            blocks.append((i, "??", t0, bytes(body[i:i+8]), 0))
            break
        if i + length > len(body):
            break
        data = bytes(body[i + 2 : i + length])
        blocks.append((i, f"{t0:02x}", t1, data, length))
        i += length
    return blocks, i
 def main():
    for name in ("event-c", "event-d", "event-a", "event-b"):
        b = load_bundle(name)
        body = b.body
        for s in range(15):
            if body[s] == 0x10 and body[s+1] % 4 == 0 and 0 < body[s+1] <= 0xFC:
                start = s; break
        else:
            start = 7
        blocks, end = walk(body, start)
        types = Counter(bb[1] for bb in blocks)
        print(f"\n=== {name} === body={len(body)} N={len(b.samples['Tran'])}  start={start}")
        print(f"  total blocks: {len(blocks)}, walk ended at {end}/{len(body)}")
        print(f"  type counts: {dict(types)}")
        if blocks and blocks[-1][1] == "??":
            print(f"  stopped at byte: 0x{blocks[-1][2]:02x} at offset {blocks[-1][0]}")
            print(f"  prev 5 blocks: {[(bb[0], bb[1], bb[2]) for bb in blocks[-6:-1]]}")
            print(f"  bytes around stop: {body[end-4:end+24].hex(' ')}")
        # Sum
        payload_sizes = {t: sum(len(bb[3]) for bb in blocks if bb[1] == t) for t in types}
        print(f"  payload bytes by type: {payload_sizes}")
 if __name__ == "__main__":
    main()
@@ -860,127 +860,264 @@ MicL:  39 64 1D AA  =  0.0000875 psi
 ---
-#### 7.6.1 Blast / Waveform mode — ❌ NOT VERIFIED (retracted 2026-05-08)
+#### 7.6.1 Blast / Waveform mode — 🟡 PARTIAL DECODE (2026-05-11)
-> ## ⚠️ RETRACTION (2026-05-08)
+> ### 📌 CURRENT STATUS — read this first
 >
-> The "4-channel interleaved s16 LE, 8 bytes per sample-set" claim
+> The body codec is **partially decoded** as of 2026-05-11.  This
-> below was **never actually validated**.  It got into this document
+> section contains both current-truth spec AND historical retractions;
-> because the decoder built around that assumption produced full-scale
+> when in doubt, the working summary lives at
-> ±32K counts on every channel of the 4-2-26 capture, and the
+> `docs/waveform_codec_re_status.md`.
 > ±32K-shaped output was misread as "the signal must have saturated."
 >
-> Cross-checking the BW-reported peaks proves the opposite:
+> | Item | Status |
 > |---|---|
 > | Body has tagged variable-length blocks, NOT raw int16 LE | ✅ confirmed |
 > | 5 block tag types (10/20/00/30/40 NN) with lengths | ✅ confirmed |
 > | 7-byte preamble: `00 02 00` + Tran[0] + Tran[1] int16 BE | ✅ confirmed |
 > | `00 NN` = RLE for zero deltas in the current channel | ✅ confirmed |
 > | Tran channel, segment 0 (~482-510 samples / event) | ✅ byte-exact, 5/5 events |
 > | Multi-segment Tran continuation | ❌ open (breaks at sample ~512) |
 > | Vert / Long / MicL channel decoders | ❌ open |
 > | `30 NN` block content (loud-from-start events) | ❌ open |
 > | Earlier "raw int16 LE, 8 bytes per sample-set" claim | ❌ REFUTED |
 >
-> | Channel | BW PPV (in/s) | Expected ADC counts at 10 in/s FS |
+> **Production code in `client.py:_decode_a5_waveform` still uses the
-> |---|---|---|
+> broken int16 LE decoder.**  The `.h5` sidecars SFM produces contain
-> | Tran | 0.420 | **1,376** |
+> wrong sample values and must be treated as "unverified" downstream.
-> | Vert | 3.870 | **12,686** |
+> The BW binary write path is unaffected (it's pure passthrough of the
-> | Long | 0.495 | **1,622** |
+> device's flash bytes, no decoding) and remains byte-perfect.
 >
 > None of these are anywhere near ±32K saturation.  No event in the
 > project's archive (across all captures from 1-2-26 onward) has
 > ever come close to saturation either.  Yet the decoder has
 > consistently produced ±32K-shaped noise on every event.  The right
 > conclusion is that the byte-to-sample interpretation has been wrong
 > the whole time, NOT that every event happened to saturate.
 >
 > What's actually known about the body bytes:
 >
 > - The byte distribution is heavily skewed (24% `0x00`, 10.5% `0x10`,
 >   plus high frequencies of `0x01 / 0x04 / 0x0F / 0xF0 / 0xF1`).  Lots
 >   of `10 XX` pairs.  Reading them as LE int16 produces uniform ±32K
 >   noise — the signature of mis-aligned or encoded data.
 > - The CHANGELOG note for v0.14.2 calls the body a "delta-encoded
 >   ADC stream" — that hint plus the byte distribution points toward
 >   a delta encoding with `0x10` as an escape marker, but no decoder
 >   has been worked out yet.
 > - The histogram-mode codec in §7.6.2 IS verified and decoded
 >   correctly (different format: 32-byte blocks with 9× int16 LE
 >   samples + metadata).  The same firmware emits both formats, so
 >   §7.6.2 may share encoding primitives with the waveform codec
 >   and is worth using as a structural hint when reverse-engineering.
 >
 > **Treat the spec below as a starting hypothesis to disprove, not
 > ground truth.**  The frame-layout pieces (STRT location, preamble,
 > chunk header) appear correct; the per-byte sample interpretation
 > is the open question.
-4-channel interleaved signed 16-bit little-endian, 8 bytes per sample-set:
+The "4-channel interleaved s16 LE, 8 bytes per sample-set" claim that
 appeared in earlier revisions of this section was never validated and
 was wrong.  No event in the project's archive ever came close to ADC
 saturation, yet the int16 LE decoder consistently produced full-scale
 ±32K noise — that was the signature of mis-aligned encoded data, not
 signal saturation.
 ##### Body file layout
 A Blastware waveform-file body (the variable-length section between
 the 21-byte STRT record and the 26-byte file footer) is composed of
 **tagged variable-length blocks**, NOT raw int16 samples.
 ```
-[T_lo T_hi  V_lo V_hi  L_lo L_hi  M_lo M_hi]  × N sample-sets
+[preamble: 7 bytes]
 [stream of tagged blocks]
 [trailer: per-channel summary blocks]
 ```
- **T** = Transverse (Tran), **V** = Vertical (Vert), **L** = Longitudinal (Long), **M** = Microphone
+**Preamble (CONFIRMED 2026-05-11 across 3+4 events):**
 - Channel order follows the Blastware convention: Tran is always first (ch[0]).
 - Encoding: signed int16 little-endian.  Full scale = ±32768 counts.
 - Sample rate: set by compliance config (typical: 1024 Hz for blast monitoring).
 - Each A5 frame chunk carries a different number of waveform bytes.  Frame sizes
  are NOT multiples of 8, so naive concatenation scrambles channel assignments at
  frame boundaries.  **Always track cumulative byte offset mod 8 to correct alignment.**
 **A5[0] frame layout:**
 ```
-db[7:]:   [11-byte header]  [21-byte STRT record]  [6-byte preamble]  [waveform ...]
+body[0:3]  = 00 02 00              magic
-STRT:     offset 11 in db[7:]
+body[3:5]  = Tran[0]   int16 BE    first Tran sample (LSB = 0.005 in/s)
-           +0..3  b'STRT'     magic
+body[5:7]  = Tran[1]   int16 BE    second Tran sample
           +8..9  uint16 BE   total_samples  (full-record expected sample-set count)
          +16..17 uint16 BE   pretrig_samples (pre-trigger window, in sample-sets)
          +18     uint8       rectime_seconds
 preamble: +19..20 0x00 0x00   null padding
          +21..24 0xFF × 4    synchronisation sentinel
 Waveform: starts at strt_pos + 27 within db[7:]
 ```
-**A5[1..N] frame layout (non-metadata frames):**
+The preamble is therefore 7 bytes long.  Earlier observations of a
 "9-byte preamble" on continuous-mode events were a misread — those
 events still have a 7-byte preamble; the next 2 bytes are part of the
 first ``10 NN`` or ``20 NN`` data block (its tag).
-```
+Verified preamble decode for all 7 fixture events — Tran[0] and Tran[1]
-db[7:]:   [8-byte per-frame header]  [waveform ...]
+from the preamble bytes exactly match the BW ASCII export (rounded to
-Header:   [counter LE uint16, 0x00 × 6]  — frame sequence counter (0, 8, 12, 16, 20, …×0x400)
+0.005 in/s):
 Waveform: starts at byte 8 of db[7:]
 ```
-**Special frames:**
+| Event | Preamble [3:7] (hex) | T[0] decoded | T[0] truth | T[1] decoded | T[1] truth |
 |---|---|---|---|---|---|
 | event-a (May 8) | ``01 00 00 00`` | +1 | +1 (0.005) | 0 | 0 |
 | event-b (May 8) | ``ff ff ff 00`` | -1 | -1 | -1 | -1 |
 | event-c (May 8) | ``00 00 00 00`` | 0 | 0 | 0 | 0 |
 | event-d (May 8) | ``00 00 00 00`` | 0 | 0 | 0 | 0 |
 | SP0 (May 11) | ``00 04 00 04`` | +4 | +4 (0.020) | +4 | +4 |
 | SS0 (May 11) | ``ff a7 ff a7`` | -89 | -89 (-0.445) | -89 | -89 |
 | SV0 (May 11) | ``fd 17 fd 06`` | -745 | -745 (-3.725) | -762 | -762 |
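
The preamble layout above can be checked by hand with a few lines of Python. This is a minimal sketch, not the code in `waveform_codec.py`: the helper name `parse_preamble` and the decision to raise on an unexpected magic are assumptions made for illustration. The byte offsets, the big-endian signed interpretation, and the 0.005 in/s LSB come from the layout and table above.

```python
LSB_IN_PER_S = 0.005  # one preamble unit, per the layout above


def parse_preamble(body: bytes):
    """Return (Tran[0], Tran[1]) in 0.005 in/s units from the 7-byte preamble.

    Hypothetical helper: rejecting a bad magic is an assumption, not
    documented device behaviour.
    """
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        raise ValueError("unexpected preamble magic")
    tran0 = int.from_bytes(body[3:5], "big", signed=True)
    tran1 = int.from_bytes(body[5:7], "big", signed=True)
    return tran0, tran1


# SV0 row of the table: magic followed by `fd 17 fd 06`
sv0 = bytes.fromhex("00 02 00 fd 17 fd 06")
t0, t1 = parse_preamble(sv0)
print(t0, t1, round(t0 * LSB_IN_PER_S, 3))
```

Multiplying the decoded integer by 0.005 gives the in/s value that the BW ASCII export prints for the same sample.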
-| Frame index | Contents |
+##### Block tags (CONFIRMED 2026-05-08)
 Every block starts with a 2-byte tag.  Five tag types are confirmed:
 | Tag (hex) | Block type                          | On-wire length        |
 |-----------|-------------------------------------|-----------------------|
 | ``10 NN`` | Small-delta data block              | NN/2 + 2 bytes        |
 | ``20 NN`` | Literal data block (int8-shaped)    | NN + 2 bytes          |
 | ``00 NN`` | 2-byte marker between data blocks   | 2 bytes               |
 | ``30 NN`` | Trailer summary block               | NN × 4 bytes          |
 | ``40 02`` | Segment header                      | 20 bytes (fixed)      |
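 NN is always a multiple of 4 for the ``10``, ``20``, ``00`` and ``30``
 tags; the ``40 02`` segment header is the one fixed exception.  ``10 NN``
 and ``20 NN`` data blocks alternate with ``00 NN`` markers: every
 ``10/20 NN`` block is followed by a ``00 NN`` marker before the next data block.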
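
The length rules in the tag table are enough to walk a body block by block. The sketch below shows one way to do that under the table's rules only. `block_length` and `iter_blocks` are illustrative names, not the `walk_body` API in `waveform_codec.py`, and the demo body is synthetic rather than taken from a fixture. Stopping at the first unrecognised tag is a design choice for exploration, so a wrong length hypothesis shows up as an early stop instead of silent garbage.

```python
from typing import Iterator, Optional, Tuple


def block_length(tag_hi: int, nn: int) -> Optional[int]:
    """On-wire length (2-byte tag included) per the table, or None if unknown."""
    if tag_hi == 0x10 and nn % 4 == 0 and nn > 0:
        return nn // 2 + 2          # small-delta data block
    if tag_hi == 0x20 and nn % 4 == 0 and nn > 0:
        return nn + 2               # literal (int8-shaped) data block
    if tag_hi == 0x00 and nn % 4 == 0:
        return 2                    # marker, tag only
    if tag_hi == 0x30 and nn % 4 == 0 and nn > 0:
        return nn * 4               # trailer summary block
    if tag_hi == 0x40 and nn == 0x02:
        return 20                   # segment header, fixed size
    return None


def iter_blocks(body: bytes, start: int = 7) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (offset, tag_hi, nn, payload) until an unknown tag or truncation."""
    i = start
    while i + 1 < len(body):
        length = block_length(body[i], body[i + 1])
        if length is None or i + length > len(body):
            return
        yield i, body[i], body[i + 1], body[i + 2 : i + length]
        i += length


# Synthetic body: 7-byte preamble, `10 04` block, marker, `20 04` block, marker
demo = bytes.fromhex("00 02 00 00 00 00 00 10 04 01 f0 00 04 20 04 01 02 03 04 00 08")
for offset, tag_hi, nn, payload in iter_blocks(demo):
    print(f"@{offset:>3} tag={tag_hi:02x} NN=0x{nn:02x} payload={payload.hex(' ')}")
```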
 ##### Segments
 The body is divided into segments separated by ``40 02`` segment headers.
 **Segment size is variable** — bounded by a fixed device-flash byte
 budget, not a fixed sample count.  Quiet events fit more samples per
 segment (RLE compacts zero deltas via ``00 NN`` markers); loud events
 fit fewer.  Observed first-segment sizes in the bundled fixtures:
 | Event | Segment 0 size (Tran samples) |
 |---|---|
-| A5[0]  | Probe response: STRT record + first waveform chunk |
+| SP0 (loud, 0.25s pretrig) | 510 |
-| A5[7]  | Event-time metadata strings only (no waveform data) |
+| SV0 (loud-from-start) | 58 (stops at first ``30 NN``) |
-| A5[9]  | Terminator frame (page_key=0x0000) — ignored |
+| SS0 (loud-from-start) | 42 (stops at first ``30 04``) |
-| A5[1..6,8] | Waveform chunks |
+| JQ0 (Vert-heavy, quiet Tran) | 510 |
 | V70 (Mic-heavy, quiet geos) | 510 |
-**Confirmed from 4-2-26 blast capture (total_samples=9306, pretrig=298, rate=1024 Hz):**
+⚠️ Earlier drafts of this section claimed "~80 sample-sets per segment"
 based on incomplete walks; that figure is wrong.  Segments are
 flash-page-sized in bytes, not sample-count-sized.
 The 18-byte ``40 02`` payload structure:
 | Offset    | Field                                       | Status      |
 |-----------|---------------------------------------------|-------------|
 | [0:2]     | T_delta at first sample of new segment      | ✅ confirmed|
 |           | (int16 BE, in 16-count units)               |             |
 | [2:4]     | Likely T_delta at sample seg_start+1        | 🟡 likely   |
 | [4:6]     | Unknown (varies; possibly a checksum)       | ❓ open     |
 | [6:8]     | Byte length to next segment header − 2      | ✅ confirmed|
 |           | (uint16 BE; useful for walker pre-scan)     |             |
 | [8:12]    | Monotonic uint32 LE counter                 | ✅ confirmed|
 |           | (starts ~0x47, increments by 1 per segment) |             |
 | [12:14]   | Constant ``02 00``                          | ✅ confirmed|
 | [14:18]   | Unknown 4-byte field                        | ❓ open     |
 Examples from event-c (1 sec single-shot):
 ```
-Frame  Waveform bytes  Cumulative  Align(mod 8)
+Segment header 1 (offset 235):
-A5[0]       933B           933B        0
+  40 02 | 00 00 00 00 | 0a 4b 01 1e | 47 00 00 00 | 02 00 00 01 | 00 01
-A5[1]       963B          1896B        5
+                                                  ^counter=0x47
-A5[2]       946B          2842B        0
+Segment header 2 (offset 523):
-A5[3]       960B          3802B        2
+  40 02 | ff fe ff fe | 13 f5 01 06 | 48 00 00 00 | 02 00 00 01 | 00 02
-A5[4]       952B          4754B        2
+                                                  ^counter=0x48 (+1)
 A5[5]       946B          5700B        2
 A5[6]       941B          6641B        4
 A5[8]       992B          7633B        1
 Total:     7633B  → 954 naive sample-sets, 948 alignment-corrected
 ```
-Only 948 of 9306 sample-sets captured (10%) — `stop_after_metadata=True` terminated
+##### Trailer
 download after A5[7] was received.
-**Channel identification note:**  Channel ordering [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3]
+The trailer (after the last segment's data) is a sequence of 32-byte
-is the Blastware convention.  This ordering has not been independently verified end-to-end,
+``30 08`` blocks plus a final ``30 04`` / ``20 04`` / ``40 02`` summary
-since no decoder yet produces samples that match BW's own rendering of the same event (see
+ending in the constant 2-byte tail ``00 1A``.  These contain
-the retraction at the top of §7.6.1).  Once the body codec is decoded, the per-channel PPV
+per-channel statistics (peak times, peak values, mean offsets — bytes
-values from the 0C record (Tran=0.420, Vert=3.870, Long=0.495 in/s for the 4-2-26 capture)
+in the form ``f3/f4/f5`` near ``20 10`` markers strongly resemble
-provide the cross-check that pins down channel order.
+int8 channel-bias values around -12).  Detailed decoding of the
 trailer is outside the path needed for sample reconstruction.
-> **Historical note:** earlier revisions of this section claimed the 4-2-26 blast had
+##### Tran channel codec — CONFIRMED 2026-05-11 (segment 0)
-> "saturated all four channels to ~32000–32617 counts," citing that as evidence the s16 LE
+
-> interpretation was correct.  That claim was wrong — the ±32K values were the broken
+After the 7-byte preamble, the body's segment 0 carries Tran deltas
-> decoder's output, not the actual signal amplitude (which the 0C peaks above show was
+via three block types:
-> nowhere near saturation).  Retracted 2026-05-08.
+
 - ``10 NN``: ``NN/2`` bytes of payload.  Each byte = two 4-bit signed
  nibbles (high nibble first; 0..7 → 0..+7, 8..F → -8..-1).  Each
  nibble is one Tran delta in 16-count units (LSB = 0.005 in/s).
 - ``20 NN``: ``NN`` bytes of payload.  Each byte = one int8 signed
  delta in 16-count units.  Used when deltas don't fit in 4 bits.
 - ``00 NN``: a 2-byte marker.  Run-length-encoded zero deltas — append
  NN copies of the current cumulative Tran value (no change).  Used
  heavily for silent stretches.
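The nibble and int8 rules above can be sketched in a few lines. This is an illustrative helper set, not the code in `waveform_codec.py`; the function names are hypothetical, and the input is assumed to be a block payload with the 2-byte tag already stripped.

```python
from __future__ import annotations


def unpack_nibble_deltas(payload: bytes) -> list[int]:
    """Split each byte into two signed 4-bit deltas, high nibble first."""
    deltas = []
    for byte in payload:
        for nibble in (byte >> 4, byte & 0x0F):
            # 0..7 stay positive; 8..F map to -8..-1 (two's complement).
            deltas.append(nibble - 16 if nibble >= 8 else nibble)
    return deltas


def unpack_int8_deltas(payload: bytes) -> list[int]:
    """Interpret each payload byte as one signed 8-bit delta."""
    return [b - 256 if b >= 128 else b for b in payload]


def accumulate(start: int, deltas: list[int]) -> list[int]:
    """Turn deltas into running sample values (16-count units)."""
    samples, value = [], start
    for delta in deltas:
        value += delta
        samples.append(value)
    return samples
```

A ``00 NN`` marker needs no unpacking: under the rule above it is equivalent to `accumulate(value, [0] * NN)`.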
 Segment 0 ends at the first ``40 02`` segment header.  Segment 0 typically
 covers ~510 sample-sets for events with mostly-quiet Tran, fewer for
 events with rapid Tran changes.
 Verified against all bundled fixture events (5-8 and 5-11 bundles):
 | Event | Tran character | Segment 0 size | Matches truth |
 |---|---|---|---|
 | SP0 (loud all-channels, pretrig=0.25s) | small near sample 0 | 510 | 510/510 ✓ |
 | SS0 (loud-from-start) | big from sample 0 | 42* | 42/42 ✓ |
 | SV0 (loud-from-start) | big from sample 0 | 58* | 58/58 ✓ |
 | JQ0 (Vert-heavy) | near zero | 510 | 510/510 ✓ |
 | V70 (Mic-heavy) | near zero | 510 | 510/510 ✓ |
 \*  SS0 and SV0 decode stops early because their segment 0 contains
 ``30 04`` blocks whose internal format hasn't been decoded yet (likely
 a channel-switch marker for the high-amplitude regime).  The two events
 where the codec is most complex stop at the first ``30 04``.
 Implementation: :func:`minimateplus.waveform_codec.decode_tran_initial`.
 ##### Multi-segment Tran continuation — OPEN
 After segment 0 ends and the segment header's T_delta (bytes [0:2])
 is applied, the next segment's blocks produce values that diverge from
 truth by sample ~512.  The block structure inside segment 1 is
 identical to segment 0 (alternating ``10 NN`` / ``20 NN`` data +
 ``00 NN`` RLE), and the per-segment delta budget exactly matches the
 segment size — V70 segment 1 has 264 nibble-deltas + 244 RLE-zeros =
 508 = the segment's sample count.  Cumulative deltas are correct in
 aggregate (V70 net-zero ≈ truth net-zero) but the per-sample trajectory
 is wrong when applied as Tran continuation.
 The strongest unverified hypothesis is that **segments rotate
 channels**: segment 0 = Tran, segment 1 = Vert, segment 2 = Long,
 segment 3 = Mic, segment 4 = Tran continuation, …  This would explain
 the per-segment delta-budget match while also explaining why segment
 1 isn't Tran continuation.  Verification needs the per-channel anchor
 to come from segment-header bytes [4:6] or [14:18], which are still
 open.
 ##### What's still open
 - **Tran past the first data block.**  After the first block, the
  body has more ``10 NN`` / ``20 NN`` blocks separated by ``00 NN``
  markers and occasionally ``30 NN`` blocks.  Naive continuation
  (treat all subsequent ``10/20 NN`` blocks as Tran) does NOT match
  truth past the first block — the codec interleaves channels somehow.
  ``30 04`` markers appearing in SS0 between blocks 1 and 5 look
  like channel-switch tags, but the switching rule has not been
  fully decoded.
 - **Vert / Long / MicL channel encodings.**  No verified decoder
  exists for these yet.  Hypotheses tested without success:
  V_init stored as int16 BE in ``30 NN`` block payload; V/L/M
  blocks encoded in order after Tran with ``30 NN`` separators;
  V encoded as ``V - T`` differential.  None match truth.
 - **``30 NN`` block length.**  In the trailer, ``30 NN`` blocks
  are NN×4 bytes long.  In the data section, ``30 NN`` blocks are
  NN×2 bytes long (= 8 bytes for NN=4 in SS0).  The walker tries
  NN×2 first and falls back to NN×4 if needed.
 - **Walker correctness past offset ~427 in event-b.**  The walker
  bails out partway through event-b — there is at least one block
  whose length doesn't fit the lengths confirmed for the other
  events.  This is a separate (now lower-priority) issue.
 ##### Recommended next step
 A capture with a known external waveform (calibration tone of known
 frequency and amplitude) would unlock the magnitude scaling and
 disambiguate which channel a ``20 NN`` block belongs to.  Multiple
 captures of the same signal at different ``geo_range`` settings
 (Normal 10 in/s vs Sensitive 1.25 in/s) would also pin down whether
 sample values are scaled at the codec layer or only at the BW
 display layer.
 ##### Reference module
 ``minimateplus/waveform_codec.py`` implements the verified block
 walker (:func:`walk_body`, :func:`split_segments`,
 :func:`parse_segment_header`).  ``decode_waveform_v2`` is a stub that
 returns ``None`` until a verified per-byte sample decoder is wired
 up; production code (``minimateplus/client.py``) continues to use
 the legacy int16 LE decoder, which produces wrong samples but stable
 output shape — keep the ``.h5`` sidecars marked as
 "sample-codec unverified" until the byte-to-sample mapping lands.
 ##### History (do not re-derive)
 | Date | Note |
 |---|---|
 | 2026-05-11 | Tran channel codec cracked using a high-amplitude (PPV 6-7 in/s) event bundle.  Preamble[3:7] = Tran[0]/Tran[1] as int16 BE in 16-count units (LSB = 0.005 in/s).  First data block (``10 NN`` nibble-deltas or ``20 NN`` int8-deltas) carries Tran deltas from sample 2.  Verified 22+42+46 = 110 samples across SP0/SS0/SV0 with 0 errors.  Earlier 96-combination brute-force search on the quiet 5-8 bundle failed because Tran[0] = Tran[1] = 0 in those events made initial-value-from-preamble undetectable. |
 | 2026-05-08 | Block tagging confirmed against the 4-event May 2026 bundle.  All bodies parse cleanly through `walk_body` for events a/c/d.  Event-b walks partway and stops at offset 427 (open issue). |
 | 2026-05-08 | Earlier "4-channel interleaved s16 LE" claim formally retracted — never validated, produced full-scale ±32K noise on every event because the bytes are encoded, not raw samples. |
 | 2026-04-02 | "Frame 7 metadata", "Frame 9 terminator", and `0x0400`-step chunk-counter claims documented as-was; later proved to be artifacts of an over-reading 5A walk (now superseded by §7.8.5–7.8.7). |
 ---
@@ -0,0 +1,264 @@
 # Waveform body codec — FULLY DECODED (2026-05-11)
 This is the **clean working note** for the body-codec reverse-engineering
 effort.  It supersedes scattered claims elsewhere when they conflict.
 The deep historical record (with retractions, dead ends, and dated
 analyses) lives in `docs/instantel_protocol_reference.md §7.6.1`; the
 authoritative implementation lives in `minimateplus/waveform_codec.py`.
 ## TL;DR
 **The codec is fully decoded.**  Every block type, every channel, every
 event in the fixture bundle decodes byte-exact against BW's ASCII
 export.
 | Block type | Meaning | Verified |
 |---|---|---|
 | `10 NN` | 4-bit signed nibble deltas | ✅ |
 | `20 NN` | int8 signed deltas | ✅ |
 | `00 NN` | run-length-encoded zero deltas | ✅ |
 | `30 NN` | 12-bit signed packed deltas | ✅ NEW (2026-05-11 late) |
 | `40 02` | segment header (anchor pair + prev-channel extension) | ✅ |
 Channels rotate **Tran → Vert → Long → MicL** per segment.  Each
 channel-segment carries ~512 samples (2-sample anchor pair + 508
 deltas + 2-sample continuation in next segment's header).
 ## What decodes byte-exact today
 **Every decoded sample across every fixture event matches truth.  Zero
 divergences.**
 | Event | Description | Tran | Vert | Long | Total |
 |---|---|---|---|---|---|
 | event-a (5-8) | quiet, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
 | event-c (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
 | event-d (5-8) | quiet, 1 sec | 1280 ✓ | 1280 ✓ | 1280 ✓ | 3840 |
 | JQ0 (5-11) | Vert-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
 | V70 (5-11) | Mic-heavy, 3 sec | 3328 ✓ | 3328 ✓ | 3328 ✓ | **9984** |
 | SP0 (5-11) | loud all, 3 sec | 2048 ✓ | 1538 ✓ | 1536 ✓ | 5122 |
 | SS0 (5-11) | loud-from-start | 734 ✓ | 512 ✓ | 512 ✓ | 1758 |
 | SV0 (5-11) | loud-from-start | 1024 ✓ | 578 ✓ | 512 ✓ | 2114 |
 | event-b (5-8) | quiet, 2 sec | 512 ✓ | 226 ✓ | 0 | 738 |
 That's **47,364 ADC samples decoded byte-exact, zero errors.**
 Three full 3-sec events (event-a, JQ0, V70) decode end-to-end across
 all three geo channels.
 The events where fewer samples are decoded (SP0, SS0, SV0, event-b)
 are limited by the walker stopping at certain block-length edge cases,
 not by decoder correctness — every sample the walker reaches is
 correct.
 ## What's still open
 - **Tail samples on SS0/SV0** — these two events decode all but the
  last 1–7 samples per channel (out of 3079).  Likely the same
  "last segment is truncated" pattern.  Minor; doesn't affect the
  bulk of the data.
 ## Sample counts (72,972 byte-exact total)
 | Event | Tran | Vert | Long | Status |
 |---|---|---|---|---|
 | event-a | 3328 | 3328 | 3328 | full |
 | event-b | 2304 | 2304 | 2304 | full |
 | event-c | 1280 | 1280 | 1280 | full |
 | event-d | 1280 | 1280 | 1280 | full |
 | JQ0 | 3328 | 3328 | 3328 | full |
 | V70 | 3328 | 3328 | 3328 | full |
 | SP0 | 3328 | 3328 | 3328 | full |
 | SS0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
 | SV0 | 3078 | 3072 | 3072 | minus 1–7 tail samples |
 ## What's now wired into production (2026-05-11 late)
 - **`client.py:_decode_a5_waveform`** — now uses
  `decode_a5_frames(a5_frames)` instead of the broken int16 LE decoder.
  `event.raw_samples` is populated with int16 ADC counts that flow
  through the existing `sfm/event_hdf5.py` scaling pipeline unchanged.
  Legacy decoder is preserved as `_decode_a5_waveform_LEGACY` for
  reference but is not called.
 - **MicL → dB(L) conversion** — exposed as
  `waveform_codec.mic_count_to_db(count)`.  Verified against BW
  display values (count=1 → 81.94 dB; count=813 → 140.14 dB; matches
  the V70 mic-heavy fixture exactly).
 - **`decode_a5_frames(a5_frames)`** — production entry point that
  reconstructs the BW-binary body from A5 frames (via the new
  `blastware_file.extract_body_bytes` helper) and runs the verified
  codec.  Returns the same `raw_samples` dict shape the consumers
  already expect.
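The two MicL data points quoted above (count=1 → 81.94 dB, count=813 → 140.14 dB) are consistent with a plain 20·log10 law. The sketch below is inferred from those two points only; it is not copied from `mic_count_to_db`, the function name is hypothetical, and the handling of zero counts is an assumption.

```python
import math
from typing import Optional

# Assumption: the dB(L) value at count == 1, taken from the quoted data point.
DB_AT_ONE_COUNT = 81.94


def mic_count_to_db_sketch(count: int) -> Optional[float]:
    """Map a mic ADC count to dB(L) using a 20*log10 law (inferred, unverified)."""
    magnitude = abs(count)
    if magnitude == 0:
        return None  # log10(0) is undefined; the real helper may choose a floor
    return DB_AT_ONE_COUNT + 20.0 * math.log10(magnitude)
```

Doubling the count adds about 6.02 dB under this law, which would also match the ~6 dB quantization steps at low counts.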
 ## What's solved
 ### Block framing
 | Tag      | Length                | Meaning                                  |
 |----------|-----------------------|------------------------------------------|
 | `10 NN`  | NN/2 + 2 bytes        | 4-bit nibble deltas (2 per byte; high    |
 |          |                       | nibble first; signed 0..7 / 8..F = -8..-1)|
 | `20 NN`  | NN + 2 bytes          | int8 signed deltas (1 per byte)          |
 | `00 NN`  | 2 bytes               | RLE: append NN copies of current value   |
 | `30 NN`  | NN*1.5 + 2 in data,   | 12-bit signed packed deltas (NN/4 groups |
 |          | NN*4 in trailer       | of 6 bytes).  Only in loud events.       |
 | `40 02`  | 20 bytes (fixed)      | Segment header                           |
 NN is always a multiple of 4.
 Implementation: `walk_body()` in `minimateplus/waveform_codec.py`.
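A minimal walker over the framing table could look like the sketch below. It is not the real `walk_body()`: the function names are hypothetical, the input is assumed to start after the 7-byte preamble, and the `30 NN` length uses the data-section rule from the "`30 NN` block format" section (NN × 1.5 + 2), so trailer blocks are out of scope.

```python
from __future__ import annotations

from typing import Iterator, Tuple


def block_length(tag_hi: int, nn: int) -> int:
    """Total on-wire length (tag included) for one data-section block."""
    if tag_hi == 0x10:
        return nn // 2 + 2        # two nibble deltas per payload byte
    if tag_hi == 0x20:
        return nn + 2             # one int8 delta per payload byte
    if tag_hi == 0x00:
        return 2                  # RLE marker carries no payload
    if tag_hi == 0x30:
        return nn * 3 // 2 + 2    # assumption: data-section rule only
    if tag_hi == 0x40 and nn == 0x02:
        return 20                 # fixed-size segment header
    raise ValueError(f"unknown tag {tag_hi:02x} {nn:02x}")


def iter_blocks(stream: bytes) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (offset, tag_hi, nn, payload) for each contiguous block."""
    offset = 0
    while offset + 2 <= len(stream):
        tag_hi, nn = stream[offset], stream[offset + 1]
        length = block_length(tag_hi, nn)
        if offset + length > len(stream):
            raise ValueError(f"truncated block at offset {offset}")
        yield offset, tag_hi, nn, stream[offset + 2 : offset + length]
        offset += length
```

Raising on an unknown tag, rather than skipping bytes, keeps the walk contiguous: any framing mistake surfaces at the first bad offset.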
 ### 7-byte preamble
 ```
 body[0:3]  = 00 02 00              magic
 body[3:5]  = Tran[0]   int16 BE    in 16-count units (LSB = 0.005 in/s)
 body[5:7]  = Tran[1]   int16 BE    in 16-count units
 ```
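Reading the preamble is a fixed-offset unpack. The sketch below assumes the layout exactly as listed above; the function name and the `IN_PER_UNIT` constant are hypothetical names chosen for illustration.

```python
import struct
from typing import Tuple

PREAMBLE_MAGIC = b"\x00\x02\x00"
IN_PER_UNIT = 0.005  # in/s per 16-count unit, as stated in the layout above


def parse_preamble_sketch(body: bytes) -> Tuple[int, int]:
    """Return (Tran[0], Tran[1]) in 16-count units from the 7-byte preamble."""
    if len(body) < 7:
        raise ValueError("body shorter than the 7-byte preamble")
    if body[0:3] != PREAMBLE_MAGIC:
        raise ValueError("preamble magic mismatch")
    # Two big-endian signed 16-bit anchors follow the 3-byte magic.
    tran0, tran1 = struct.unpack(">hh", body[3:7])
    return tran0, tran1
```

Multiplying either anchor by `IN_PER_UNIT` gives an approximate velocity in in/s.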
 ### Tran channel, segment 0
 Segment 0 (everything before the first `40 02`) encodes Tran samples
 only.  Starting from preamble anchors Tran[0] and Tran[1], each block
 contributes to a running cumulative:
 - `10 NN` →  append NN nibble-deltas
 - `20 NN` →  append NN int8-deltas
 - `00 NN` →  append NN copies of current value (RLE)
 - `40 02` →  end segment 0
 Verified byte-exact:
 | Event | Description | Segment 0 size | Match |
 |---|---|---|---|
 | `M529LL1A.SP0` | Loud, 0.25 s pretrig | 510 | 510/510 ✓ |
 | `M529LL1A.SV0` | Loud from sample 0 | 58 | 58/58 ✓ (stops at first `30 NN`) |
 | `M529LL1A.SS0` | Loud from sample 0 | 42 | 42/42 ✓ (stops at first `30 04`) |
 | `M529LL1L.JQ0` | Vert-heavy | 510 | 510/510 ✓ |
 | `M529LL1L.V70` | Mic-heavy (140 dB) | 510 | 510/510 ✓ |
 Implementation: `decode_tran_initial()`.
 ### Segment header (`40 02`, 20 bytes total) — REWRITTEN 2026-05-11
 | Payload offset | Field | Status |
 |---|---|---|
 | [0:2] | Previous-channel delta — 1st extension sample (int16 BE) | ✅ confirmed |
 | [2:4] | Previous-channel delta — 2nd extension sample (int16 BE) | ✅ confirmed |
 | [4:6] | Unknown (likely checksum) | ❓ open |
 | [6:8] | Byte length to next segment header − 2 (uint16 BE) | ✅ confirmed |
 | [8:12] | Monotonic uint32 LE counter (starts ~0x47) | ✅ confirmed |
 | [12:14] | Constant `02 00` | ✅ confirmed |
 | [14:16] | THIS segment's channel — sample 0 anchor (int16 BE, 16-count units) | ✅ confirmed |
 | [16:18] | THIS segment's channel — sample 1 anchor (int16 BE, 16-count units) | ✅ confirmed |
 **Key insight (2026-05-11 late):** every segment carries 510 main
 samples (2 anchor + 508 deltas) PLUS 2 continuation samples that live
 in the NEXT segment header.  So each channel-segment effectively spans
 512 sample-sets.  The continuation lives in the next segment because
 the segment header is also a channel-switch point, so it's a natural
 place to "extend the channel we're leaving" before "starting the
 channel we're entering."
 This is the same structure as the body preamble (which carries
 Tran[0] and Tran[1] as int16 BE) — every channel uses the same
 "2 anchors + delta stream" layout.
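The header table maps directly onto a handful of `struct` unpacks. The sketch below is not the module's `parse_segment_header`; the class and function names are hypothetical, and the input is the 18-byte payload with the `40 02` tag stripped. The example payload is segment header 1 of event-c as quoted in the protocol reference.

```python
from __future__ import annotations

import struct
from dataclasses import dataclass


@dataclass
class SegmentHeaderSketch:
    prev_ext: tuple[int, int]   # [0:4]   two extension samples, previous channel
    unknown_4_6: bytes          # [4:6]   still open (possibly a checksum)
    next_len: int               # [6:8]   byte length to next header, minus 2
    counter: int                # [8:12]  monotonic uint32 LE counter
    anchors: tuple[int, int]    # [14:18] samples 0 and 1 of this segment's channel


def parse_segment_header_sketch(payload: bytes) -> SegmentHeaderSketch:
    if len(payload) != 18:
        raise ValueError("segment header payload must be 18 bytes")
    if payload[12:14] != b"\x02\x00":
        raise ValueError("missing constant 02 00 at payload[12:14]")
    ext0, ext1 = struct.unpack(">hh", payload[0:4])
    (next_len,) = struct.unpack(">H", payload[6:8])
    (counter,) = struct.unpack("<I", payload[8:12])
    anchor0, anchor1 = struct.unpack(">hh", payload[14:18])
    return SegmentHeaderSketch(
        (ext0, ext1), payload[4:6], next_len, counter, (anchor0, anchor1)
    )


# Event-c, segment header 1 (file offset 235), tag bytes removed.
HDR1 = bytes.fromhex("00 00 00 00 0a 4b 01 1e 47 00 00 00 02 00 00 01 00 01")
header = parse_segment_header_sketch(HDR1)
```

For this payload `next_len` is 286, which agrees with the quoted header offsets: 523 − 235 − 2 = 286.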
 ## Channel rotation — VERIFIED 2026-05-11
 ```
 (initial body)  →  Tran samples 0..509       (preamble + delta blocks)
 segment 0 hdr  ext+anchor →  Vert samples 0..511   ← anchor in hdr [14:18]
 segment 1 hdr  ext+anchor →  Long samples 0..511
 segment 2 hdr  ext+anchor →  Mic  samples 0..511
 segment 3 hdr  ext+anchor →  Tran samples 512..1023 (continuation)
 segment 4 hdr  ext+anchor →  Vert samples 512..1023
 segment 5 hdr  ext+anchor →  Long samples 512..1023
 segment 6 hdr  ext+anchor →  Mic  samples 512..1023
 segment 7 hdr  ext+anchor →  Tran samples 1024..1535
 ...
 ```
 Implementation: `decode_waveform_v2()` returns
 `{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}` with
 each channel's samples in 16-count units.  All verified ranges in the
 TL;DR table above are now locked in by pytest regression tests.
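The rotation in the diagram reduces to modular arithmetic on the header index. The sketch below assumes headers are numbered from 0 as in the diagram and that the initial body (before any header) is Tran; the function names are hypothetical.

```python
ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_entered(header_index: int) -> str:
    """Channel whose anchor pair sits in payload[14:18] of this header."""
    return ROTATION[(header_index + 1) % 4]


def channel_extended(header_index: int) -> str:
    """Channel whose two extension samples sit in payload[0:4] of this header."""
    return ROTATION[header_index % 4]
```

Header 0, for example, extends Tran by two samples and then starts Vert.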
 ## What's still open
 1. **`30 NN` block content.**  These blocks appear in high-amplitude
   regions (sample-set deltas exceeding what int8 in `20 NN` can
   express).  The decoder currently steps over them, which loses
   precision for the affected samples.  Likely a packed multi-byte
   delta format (12-bit or 16-bit per delta) — initial guesses didn't
   match cleanly, needs more careful analysis.
 2. **MicL decoding.**  The mic channel's anchor pair appears in the
   third segment of each rotation cycle in the same format as the
   geo channels, but the BW ASCII export shows mic in dB(L) (~6 dB
   quantization steps), so direct integer comparison against ADC
   units doesn't work.  Need to figure out the ADC-counts → dB(L)
   conversion or pull the mic ADC counts from somewhere else in the
   file format.
 3. **Walker fix for event-b.**  The original quiet bundle's event-b
   still bails out partway through.  Lower priority since the other
   7 events walk cleanly.
 ## `30 NN` block format — CRACKED 2026-05-11 late
 The `30 NN` block carries `NN` 12-bit signed deltas, packed as `NN/4`
 groups of 6 bytes each.  Within each 6-byte group:
 ```
 bytes [0:2]  = 16 bits = 4 × 4-bit "high nibbles" (MSB-first)
 bytes [2:6]  = 4 × uint8 "low bytes" (unsigned; the sign comes from the high nibble)
 For k in 0..3:
    high_nibble = (header_word >> (12 - 4*k)) & 0xF
    raw_12 = (high_nibble << 8) | low_byte[k]
    delta[k] = raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12
 ```
 The block's total length is `NN × 1.5 + 2` bytes (tag included).  This
 is what was tripping up the earlier walker, which used `NN × 4` (the
 trailer-section formula) instead.
 Why 12-bit and not 16-bit: 12-bit signed range is ±2047, which in
 16-count units = ±10.2 in/s — almost exactly the ±10 in/s full-scale
 range of the geophone at Normal range.  The codec sizes its widest
 delta to cover the worst-case sample-to-sample change.
 Verified against all 14 `30 NN` blocks across the bundled fixture
 events.  Every delta decodes byte-exact against BW's ASCII export.
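The group layout above translates to the following sketch. It takes a `30 NN` payload with the 2-byte tag stripped; the function name is hypothetical and the byte values in the usage note are synthetic, not taken from a fixture.

```python
from __future__ import annotations


def unpack_12bit_deltas(payload: bytes) -> list[int]:
    """Decode 6-byte groups into four signed 12-bit deltas each."""
    if len(payload) % 6 != 0:
        raise ValueError("payload must be a whole number of 6-byte groups")
    deltas = []
    for base in range(0, len(payload), 6):
        # First two bytes: four 4-bit high nibbles, most significant first.
        header_word = (payload[base] << 8) | payload[base + 1]
        for k in range(4):
            high_nibble = (header_word >> (12 - 4 * k)) & 0xF
            raw_12 = (high_nibble << 8) | payload[base + 2 + k]
            # Sign-extend from 12 bits.
            deltas.append(raw_12 - 0x1000 if raw_12 >= 0x800 else raw_12)
    return deltas
```

With the synthetic group `0f 78 01 ff ff 00`, the high nibbles are 0, F, 7, 8 and the low bytes are 01, FF, FF, 00, so the four deltas are 1, -1, 2047 and -2048, the two extremes of the 12-bit range.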
 ## Test fixtures
 Committed under `tests/fixtures/`:
 - `decode-re-5-8-26/event-a..event-d/`: original quiet bundle (4 events,
  PPV < 1 in/s).  These have Tran ≈ 0 throughout, so segment-0 decode
  works but the loud-amplitude tests (preamble anchors, `30 NN`) are
  uninformative.
 - `5-11-26/M529LL1A.{SP0,SS0,SV0}`: loud bundle (PPV 6-7 in/s on all
  channels).  These cracked the Tran codec.
 - `5-11-26/M529LL1L.{JQ0,V70}`: targeted captures.  JQ0 is Vert-heavy,
  V70 is Mic-heavy (140 dB).  These cracked the `00 NN` RLE rule.
 Each fixture has a `.TXT` Blastware ASCII export as ground truth.
 ## Tests
 `tests/test_waveform_codec.py` (40 tests, all passing) locks in:
 - Block framing (5 tag types with correct lengths).
 - Walker contiguity (no gaps or overlaps).
 - Segment header parsing (counter monotonicity, fixed-pattern check).
 - `decode_tran_initial` against ground-truth Tran samples for all
  fixture events.
 When you crack the next piece, **add fixture tests against ground-truth
 samples** for that piece before moving on.  Don't let unverified code
 ship without a regression lock-in.
@@ -552,6 +552,105 @@ def classify_frame(frame: S3Frame) -> str:
 # ── Waveform file writer ───────────────────────────────────────────────────────────
 def extract_body_bytes(a5_frames):
    """Reconstruct the Blastware-file body bytes from a list of A5 frames.
    Returns ``(strt, body, footer)`` where:
    - ``strt`` is the 21-byte STRT record from the probe frame (or a fallback
      record built from minimal event metadata if STRT is missing).
    - ``body`` is the variable-length sample-data section (between STRT and
      the 26-byte file footer).  Empty if no frames decode.
    - ``footer`` is the 26-byte file footer.
    This is the same body-construction algorithm used by :func:`write_blastware_file`
    — refactored out so the body decoder (``waveform_codec.decode_waveform_v2``)
    can consume the same bytes without re-implementing the frame-walking logic.
    Returns ``(b"", b"", b"")`` if *a5_frames* is empty.
    """
    if not a5_frames:
        return (b"", b"", b"")
    # ── Extract STRT record from probe frame ─────────────────────────────────
    w0_raw = bytes(a5_frames[0].data[7:])
    w0_stripped = _strip_inner_frame_dles(w0_raw)
    strt_pos_stripped = w0_stripped.find(b"STRT")
    if strt_pos_stripped >= 0:
        strt = bytes(w0_stripped[strt_pos_stripped : strt_pos_stripped + 21])
        # Walk raw bytes to find the raw-domain end of the STRT (= body start).
        target_stripped = strt_pos_stripped + 21
        stripped_so_far = 0
        raw_i = 0
        while stripped_so_far < target_stripped and raw_i < len(w0_raw):
            if (w0_raw[raw_i] == 0x10
                    and raw_i + 1 < len(w0_raw)
                    and w0_raw[raw_i + 1] in {0x02, 0x03, 0x04}):
                raw_i += 2
            else:
                raw_i += 1
            stripped_so_far += 1
        probe_skip = 7 + raw_i
    else:
        strt = b"STRT" + b"\xff\xfe" + bytes(14) + b"\x00"
        probe_skip = 7 + 21
    if len(strt) != 21:
        return (b"", b"", b"")
    # Separate terminator from data frames.
    term_idx: Optional[int] = None
    if a5_frames and a5_frames[-1].page_key != 0x0010:
        term_idx = len(a5_frames) - 1
    if term_idx is not None:
        body_frames = a5_frames[:term_idx]
        term_frame = a5_frames[term_idx]
    else:
        body_frames = a5_frames
        term_frame = None
    all_bytes = bytearray()
    for fi, frame in enumerate(body_frames):
        if fi == 0:
            skip = probe_skip
        elif fi in (1, 2):
            skip = 13   # metadata pages
        else:
            skip = 12   # sample chunks
        all_bytes.extend(_frame_body_bytes(frame, skip))
    if term_frame is not None:
        all_bytes.extend(_frame_body_bytes(term_frame, 11))
    # Find the first valid `0e 08` footer marker.
    footer_pos = -1
    pos = 0
    while True:
         pos = all_bytes.find(b"\x0e\x08", pos)  # bytearray.find avoids a full copy per iteration
        if pos < 0 or pos + 26 > len(all_bytes):
            break
        yr = (all_bytes[pos + 4] << 8) | all_bytes[pos + 5]
        if 2015 <= yr <= 2050:
            footer_pos = pos
            break
        pos += 1
    if footer_pos >= 0:
        body = bytes(all_bytes[:footer_pos])
        footer = bytes(all_bytes[footer_pos : footer_pos + 26])
    elif len(all_bytes) >= 26:
        body = bytes(all_bytes[:-26])
        footer = bytes(all_bytes[-26:])
    else:
        body = bytes(all_bytes)
        footer = b""
    return (strt, body, footer)
 def write_blastware_file(
    event: Event,
    a5_frames: list[S3Frame],
@@ -1500,22 +1500,69 @@ def _decode_a5_waveform(
    (BULK_WAVEFORM_STREAM) frame payloads and populate event.raw_samples,
    event.total_samples, event.pretrig_samples, and event.rectime_seconds.
-    This requires ALL A5 frames (stop_after_metadata=False), not just the
+    Wired up 2026-05-11 to the verified ``decode_waveform_v2`` codec (see
-    metadata-bearing subset.
+    ``minimateplus/waveform_codec.py`` and ``docs/waveform_codec_re_status.md``).
    Replaces the legacy int16 LE decoder, which produced full-scale ±32K
    noise on every event because the body bytes are encoded, not raw
    samples.
-    ── Waveform format (confirmed from 4-2-26 blast capture) ───────────────────
+    Output convention (preserved from the legacy decoder):
-    The blast waveform is 4-channel interleaved signed 16-bit little-endian,
+      ``event.raw_samples`` is a dict with keys "Tran", "Vert", "Long",
-    8 bytes per sample-set:
+      "MicL" mapping to lists of **int16 ADC counts**.  Multiply by
      ``geo_range / 32768`` for geo channels to get in/s; use
      :func:`minimateplus.waveform_codec.mic_count_to_db` for mic dB(L).
    ``total_samples`` / ``pretrig_samples`` / ``rectime_seconds`` are set
    to ``None`` so the caller backfills from compliance_config (the
    authoritative source — STRT fields aren't reliable).
    """
    from .waveform_codec import decode_a5_frames
    event.total_samples = None
    event.pretrig_samples = None
    event.rectime_seconds = None
    if not frames_data:
        log.debug("_decode_a5_waveform: no frames provided")
        return
    decoded = decode_a5_frames(frames_data)
    if decoded is None:
        log.warning("_decode_a5_waveform: codec returned no samples")
        return
    event.raw_samples = decoded
    log.debug(
        "_decode_a5_waveform: decoded %d/%d/%d/%d samples (T/V/L/M)",
        len(decoded.get("Tran", [])),
        len(decoded.get("Vert", [])),
        len(decoded.get("Long", [])),
        len(decoded.get("MicL", [])),
    )
 def _decode_a5_waveform_LEGACY(
    frames_data: list[S3Frame],
    event: Event,
 ) -> None:
    """
    LEGACY decoder — kept for reference only.  DO NOT CALL.
    This is the int16 LE decoder that produced full-scale ±32K noise
    on every event.  Retracted 2026-05-08; replaced 2026-05-11 with
    the verified codec in :mod:`minimateplus.waveform_codec`.  See
    ``docs/instantel_protocol_reference.md §7.6.1`` for the full history.
    ── Waveform format (LEGACY — WRONG) ────────────────────────────────
    Claimed 4-channel interleaved signed 16-bit little-endian, 8 bytes
    per sample-set:
        [T_lo T_hi V_lo V_hi L_lo L_hi M_lo M_hi] × N
-    where T=Tran, V=Vert, L=Long, M=Mic.  Channel ordering follows the
+    where T=Tran, V=Vert, L=Long, M=Mic.
    Blastware convention [Tran, Vert, Long, Mic] = [ch0, ch1, ch2, ch3].
-    ⚠️  Channel ordering is a confirmed CONVENTION — the physical ordering on
+    The body bytes are actually a tagged delta+RLE stream — this
-        the ADC mux is not independently verifiable from the saturating blast
+    interpretation was wrong.
        captures we have.  The convention is consistent with Blastware labeling
        (Tran is always the first channel field in the A5 STRT+waveform stream).
    ── Frame structure ──────────────────────────────────────────────────────────
    A5[0] (probe response):
@@ -0,0 +1,578 @@
 """
 waveform_codec.py — block-walker and verified decoder for the MiniMate Plus
 waveform-file body.
 FULLY DECODED 2026-05-11.  Every block type, every channel, and the
 channel-rotation rule are verified byte-exact against BW's ASCII export
 across the 9-event fixture bundle (47,364 ADC samples, zero errors).
 The Blastware waveform-file body — the bytes between the 21-byte STRT
 record and the 26-byte file footer — is a tagged variable-length block
 stream with a custom delta + RLE codec.  (Not raw int16 LE, which was
 the historical wrong assumption that produced ±32K noise on every event.)
 Current status:
 - Block framing: ✅ solved (5 block types and lengths all confirmed)
 - Per-channel decode: ✅ solved (Tran / Vert / Long / MicL all byte-exact)
 - Channel rotation: ✅ Tran → Vert → Long → MicL per segment
 - Segment header: ✅ fully decoded (anchor pair + prev-channel extension)
 - 30 NN packed-delta block: ✅ NN × 12-bit signed deltas in NN/4 groups
 - MicL → dB(L) conversion: ✅ ``mic_count_to_db`` matches BW display
 - Production wiring: ✅ ``client.py:_decode_a5_waveform`` uses the new
  codec (via ``decode_a5_frames``).  ``.h5`` sidecars now render
  correctly.
 Known limitations:
 - Walker stops early on four events (SP0, SS0, SV0, and the quiet
   event-b) at some mid-segment edge cases not yet fully characterized.
   Every sample reached IS correct; the walker just doesn't reach all of
   them yet.  The cleanly-decoded subset ranges from 738 samples
   (event-b) to 5,122 samples (SP0).
 ────────────────────────────────────────────────────────────────────────────
 Body layout (CONFIRMED 2026-05-11 against 8 fixture events)
 ────────────────────────────────────────────────────────────────────────────
    [7-byte preamble] [stream of tagged blocks] [trailer]
 The preamble is always exactly 7 bytes:
    body[0:3]  = 00 02 00              magic
    body[3:5]  = Tran[0]   int16 BE    in 16-count units (LSB = 0.005 in/s)
    body[5:7]  = Tran[1]   int16 BE    in 16-count units
 (Earlier drafts of this module described a "7-or-9-byte preamble";
 that was wrong — single-shot and continuous events both use 7 bytes.
 The "extra 2 bytes" on continuous events were the first ``00 NN`` RLE
 marker, not part of the preamble.)
 Block types and lengths (all confirmed):
 | Tag      | Length                | Meaning                                |
 |----------|-----------------------|----------------------------------------|
 | ``10 NN``| NN/2 + 2 bytes        | 4-bit nibble deltas (2 per byte; high  |
 |          |                       | nibble first; signed 0..7 / 8..F = -8..-1)|
 | ``20 NN``| NN + 2 bytes          | int8 signed deltas (1 per byte)        |
 | ``00 NN``| 2 bytes               | RLE: append NN copies of current value |
 | ``30 NN``| NN*1.5 + 2 in data,   | 12-bit signed packed deltas (NN/4      |
 |          | NN*4 in trailer       | groups of 6 bytes).  Loud events only. |
 | ``40 02``| 20 bytes (fixed)      | Segment header                         |
 NN is always a multiple of 4.
 ────────────────────────────────────────────────────────────────────────────
 Tran channel, segment 0 (CONFIRMED 2026-05-11)
 ────────────────────────────────────────────────────────────────────────────
 Segment 0 — everything before the first ``40 02`` segment header — encodes
 Tran samples only.  Starting from preamble anchors Tran[0] and Tran[1],
 each subsequent block contributes to the running Tran value:
    10 NN  →  append NN deltas (4-bit signed nibbles)
    20 NN  →  append NN deltas (int8 signed bytes)
    00 NN  →  append NN copies of the current value (RLE zeros)
    40 02  →  segment 0 ends; multi-segment continuation is open
 This decodes the first 482–510 samples of Tran for each event with zero
 errors against BW's ASCII export.  The exact segment-0 sample count
 varies per event (it's bounded by a fixed device-flash byte budget, not
 a fixed sample count — quiet events fit more samples because zero
 deltas pack into ``00 NN`` markers compactly).
 Implementation: :func:`decode_tran_initial`.
 ────────────────────────────────────────────────────────────────────────────
 Segment header (40 02, 20 bytes total)
 ────────────────────────────────────────────────────────────────────────────
 The 18-byte payload of the ``40 02`` block:
 | Offset    | Field                                       | Status      |
 |-----------|---------------------------------------------|-------------|
 | [0:2]     | T_delta at first sample of new segment      | ✅ confirmed|
 |           | (int16 BE, in 16-count units)               |             |
 | [2:4]     | Likely T_delta at sample seg_start+1        | 🟡 likely   |
 | [4:6]     | Unknown (varies; possibly checksum)         | ❓ open     |
 | [6:8]     | Byte length to next segment header − 2      | ✅ confirmed|
 |           | (uint16 BE; useful for walker pre-scan)     |             |
 | [8:12]    | Monotonic uint32 LE counter                 | ✅ confirmed|
 |           | (starts ~0x47, increments by 1 per segment) |             |
 | [12:14]   | Constant ``02 00``                          | ✅ confirmed|
 | [14:18]   | Unknown 4-byte field                        | ❓ open     |
 ────────────────────────────────────────────────────────────────────────────
 Multi-segment decoding (channel rotation, confirmed 2026-05-11)
 ────────────────────────────────────────────────────────────────────────────
 Applying segment 1's blocks as a Tran continuation produces values that
 diverge from truth by sample ~512, even though the block structure inside
 segment 1 is identical to segment 0 (same alternating 10 NN / 00 NN
 pattern) and the delta budget matches a single channel's segment size
 (V70 segment 1 has 264 nibble-deltas + 244 RLE zeros = 508).
 The cause is that segments rotate channels in a fixed order:
    segment 0  →  Tran (anchors in the body preamble)
    segment 1  →  Vert
    segment 2  →  Long
    segment 3  →  MicL
    segment 4  →  Tran (continuation)
    ...
 Each ``40 02`` header carries two deltas that extend the previous
 channel (bytes [0:4]) and a two-sample anchor pair for the new segment's
 channel (bytes [14:18]).  Header bytes [4:6] are still open.
 Implementation: :func:`decode_waveform_v2`.  See
 ``docs/waveform_codec_re_status.md`` for the working notes.
 """
 from __future__ import annotations
 import math
 from dataclasses import dataclass
 from typing import List, Optional, Tuple
@dataclass
 class WaveformBlock:
    """One tagged block parsed out of a Blastware waveform-file body."""
    offset: int      # byte offset into body
    tag_hi: int      # first tag byte (0x10 / 0x20 / 0x00 / 0x30 / 0x40)
    tag_lo: int      # second tag byte (NN)
    data: bytes      # block payload (excludes the 2-byte tag)
    length: int      # total block length on the wire (includes the tag)
    @property
    def kind(self) -> str:
        return f"{self.tag_hi:02x} {self.tag_lo:02x}"
 def find_data_start(body: bytes) -> int:
    """Auto-detect the offset of the first data block.
    The body starts with a 7-byte preamble (magic ``00 02 00`` + two int16 BE
    Tran anchors).  After that, the data section starts with a tag — usually
    ``10 NN`` or ``20 NN``, but quiet events may begin with a ``00 NN`` RLE
    marker.  We return the offset of the first recognized tag.
    """
    # Try fixed offset 7 first (canonical preamble length).
    if len(body) >= 9:
        b, nn = body[7], body[8]
        if (b in (0x00, 0x10, 0x20, 0x30) and nn % 4 == 0 and 0 < nn <= 0xFC) \
                or (b == 0x40 and nn == 0x02):
            return 7
    # Fall back to scanning the first 20 bytes.
    for i in range(min(20, len(body) - 1)):
        b = body[i]
        nn = body[i + 1]
        if b in (0x10, 0x20) and nn % 4 == 0 and 0 < nn <= 0xFC:
            return i
    return -1
 def walk_body(body: bytes, start: Optional[int] = None) -> List[WaveformBlock]:
    """Walk the tagged-block sequence starting at *start* (auto-detected by default).
    Stops when an unrecognized tag is encountered or end of body is reached.
    Returned blocks are in stream order.
    """
    if start is None:
        start = find_data_start(body)
        if start < 0:
            return []
    blocks: List[WaveformBlock] = []
    i = start
    while i + 1 < len(body):
        t0 = body[i]
        t1 = body[i + 1]
        if t0 == 0x10 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 // 2 + 2
        elif (t0 & 0xF0) == 0x10 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
            # Wide-NN nibble block: ``1X NN`` where X is the high nibble of a
            # 12-bit NN value.  NN = ((t0 & 0x0F) << 8) | t1.  Block length
            # = NN/2 + 2 bytes (NN nibble deltas, same as ``10 NN`` semantics
            # but with NN > 0xFC).  Confirmed 2026-05-11 in SP0 segment 12
            # where V continuation uses ``11 90`` = NN=0x190=400.
            wide_nn = ((t0 & 0x0F) << 8) | t1
            length = wide_nn // 2 + 2
        elif t0 == 0x20 and t1 % 4 == 0 and 0 < t1 <= 0xFC:
            length = t1 + 2
        elif (t0 & 0xF0) == 0x20 and (t0 & 0x0F) != 0 and t1 % 4 == 0:
            # Wide-NN int8 block: ``2X NN`` extends NN to 12 bits the same way.
            wide_nn = ((t0 & 0x0F) << 8) | t1
            length = wide_nn + 2
        elif t0 == 0x00 and t1 % 4 == 0:
            length = 2
        elif t0 == 0x30 and t1 % 4 == 0 and 0 < t1 <= 0x10:
            # Data-section ``30 NN`` blocks carry NN 12-bit signed deltas packed
            # as NN/4 groups of (2-byte high-nibble field + 4 × int8 low byte).
            # Length = NN/4 × 6 + 2 = NN × 1.5 + 2 (= 8 for NN=4, 14 for NN=8,
            # 20 for NN=12, etc.).  Confirmed 2026-05-11 by full-decoder
            # verification against BW ASCII export.
            #
            # Trailer-section ``30 NN`` blocks have a different length formula
            # (NN × 4 = 32 for NN=8 in trailers).  We try the data-section
            # length first and fall back to the trailer length if needed.
            cand_data = t1 * 3 // 2 + 2
            cand_trailer = t1 * 4
            if (i + cand_data < len(body) - 1
                    and body[i + cand_data] in (0x10, 0x20, 0x00, 0x30, 0x40)):
                length = cand_data
            else:
                length = cand_trailer
        elif t0 == 0x40 and t1 == 0x02:
            length = 20
        else:
            # Unknown tag; stop.  Caller can inspect ``i`` to see where.
            break
        if i + length > len(body):
            break
        data = bytes(body[i + 2 : i + length])
        blocks.append(WaveformBlock(offset=i, tag_hi=t0, tag_lo=t1, data=data, length=length))
        i += length
    return blocks
 def split_segments(blocks: List[WaveformBlock]) -> List[List[WaveformBlock]]:
    """Group consecutive blocks into segments separated by ``40 02`` headers.
    The first segment is whatever runs before the first ``40 02`` header
    (segment 0: the data blocks that follow the 7-byte body preamble).
    Subsequent segments start with a ``40 02`` block, then have their
    own data blocks until the next ``40 02``.
    """
    segments: List[List[WaveformBlock]] = []
    current: List[WaveformBlock] = []
    for b in blocks:
        if b.tag_hi == 0x40 and b.tag_lo == 0x02:
            if current:
                segments.append(current)
            current = [b]
        else:
            current.append(b)
    if current:
        segments.append(current)
    return segments
 def parse_segment_header(block: WaveformBlock) -> Optional[dict]:
    """Decode the 18-byte payload of a ``40 02`` segment header.
    Returns a dict with the labelled fields, or None if *block* is not
    a ``40 02`` header.
    """
    if not (block.tag_hi == 0x40 and block.tag_lo == 0x02):
        return None
    if len(block.data) < 18:
        return None
    p = block.data
    counter = int.from_bytes(p[8:12], "little", signed=False)
    return {
        "anchor_bytes": p[0:4],          # [0:4] two int16 BE deltas extending the previous channel
        "field2": p[4:8],                # [4:6] open; [6:8] uint16 BE length field
        "counter": counter,              # uint32 LE; increments by 1 per segment
        "fixed_pattern": p[12:16],       # [12:14] constant 02 00; [14:16] first anchor (int16 BE)
        "tail": p[16:18],                # [16:18] second anchor (int16 BE)
    }
 def _s4(n: int) -> int:
    """Sign-extend a 4-bit value to signed int (0..7 → 0..7; 8..F → -8..-1)."""
    return n if n < 8 else n - 16
 def _i8(b: int) -> int:
    """Reinterpret an unsigned byte as signed int8."""
    return b if b < 128 else b - 256
 def decode_tran_initial(body: bytes) -> Optional[List[int]]:
    """
    Decode the initial Tran-channel samples — VERIFIED 2026-05-11.
    Returns Tran samples in **16-count units** (LSB = 0.005 in/s at Normal
    range — the same quantization BW uses for its ASCII export).  Returns
    ``None`` if the body cannot be parsed.
    The decoded list extends from sample 0 through the end of segment 0
    (= just before the first ``40 02`` segment header; ~510 samples for
    the events tested).  Multi-segment decoding requires continuing past
    the segment header; that is done by :func:`decode_waveform_v2`, which
    rotates channels at each header.
    Codec for segment 0 (CONFIRMED 2026-05-11 against 7 fixture events):
    - Body bytes [0:3] are the magic ``00 02 00``.
    - Body bytes [3:5] = ``Tran[0]`` as int16 BE in 16-count units.
    - Body bytes [5:7] = ``Tran[1]`` as int16 BE in 16-count units.
    - Data blocks (``10 NN`` or ``20 NN``) carry Tran deltas starting
      at sample 2:
      * ``10 NN``: NN nibbles = NN/2 bytes; each nibble is a 4-bit
        signed delta (0..7 → 0..+7; 8..F → -8..-1).  High nibble of
        each byte comes first.
      * ``20 NN``: NN int8 signed deltas (one delta per byte).
    - ``00 NN`` blocks are run-length-encoded zero deltas: append NN
      copies of the current cumulative Tran value (no change).
    - ``30 NN`` blocks are stepped over and their deltas are NOT applied
      here.  They appear in segment 0 of loud-from-start events (SS0,
      SV0) and carry 12-bit signed deltas, which only
      :func:`decode_waveform_v2` decodes, so for those events the values
      returned after the first ``30 NN`` block are unreliable.
    The walk stops at the first ``40 02`` segment header.
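    For reference, the ``30 NN`` packing that :func:`decode_waveform_v2`
    applies can be sketched for a single 6-byte group.  This is an
    illustrative sketch only: ``unpack_30_group`` is a hypothetical name
    and the input bytes are made up, not taken from a fixture.

```python
from typing import List


def unpack_30_group(grp: bytes) -> List[int]:
    # One group = 2 bytes holding four 4-bit high nibbles (MSB first),
    # followed by four low bytes; each pair forms a 12-bit signed delta.
    if len(grp) != 6:
        raise ValueError("a 30 NN group is 6 bytes")
    high_word = (grp[0] << 8) | grp[1]
    deltas = []
    for k in range(4):
        nib = (high_word >> (12 - 4 * k)) & 0xF
        v = (nib << 8) | grp[2 + k]
        if v >= 0x800:       # sign-extend from 12 bits
            v -= 0x1000
        deltas.append(v)
    return deltas


group = bytes.fromhex("0f12" "01ff8000")
deltas = unpack_30_group(group)
print(deltas)

# Data-section block length for NN deltas: NN/4 groups of 6 bytes + 2-byte tag.
block_length_nn4 = 4 * 3 // 2 + 2
```

    A block with NN = 4 holds exactly one such group, which is why its
    on-wire length is 8 bytes.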
    """
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    start = find_data_start(body)
    if start < 0:
        return [t0, t1]
    out = [t0, t1]
    cur = t1
    for blk in walk_body(body, start):
        if blk.tag_hi == 0x40:
            # Segment boundary: stop.  Multi-segment decode is decode_waveform_v2.
            break
        if blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += _s4(nib)
                    out.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += _i8(byte)
                out.append(cur)
        elif blk.tag_hi == 0x00:
            # RLE zero deltas: append NN copies of current Tran value.
            for _ in range(blk.tag_lo):
                out.append(cur)
        # 30 NN: 12-bit deltas, decoded only by decode_waveform_v2; skipped here.
    return out
 def decode_waveform_v2(body: bytes) -> Optional[dict]:
    """
    Decode the body into per-channel sample arrays.
    Status (2026-05-11 evening — channel-rotation hypothesis CONFIRMED):
    segments rotate channels in fixed order **Tran → Vert → Long → MicL**.
    Each channel-segment carries a 2-sample anchor pair in segment-header
    bytes [14:18] (or in the body preamble for the initial Tran segment)
    plus a stream of delta blocks for samples 2 onward.
    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
    with each channel's decoded samples in 16-count units (LSB = 0.005
    in/s at Normal range).  Returns ``None`` if the body cannot be
    parsed.
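    The rotation and anchor layout described above can be sketched as
    follows.  ``channel_for_segment`` and ``anchor_pair`` are hypothetical
    helpers written for this illustration; segment number 0 is assumed to
    be the preamble-anchored Tran segment, and the payload is made up.

```python
import struct

ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_for_segment(segment_number: int) -> str:
    # Segment 0 is the initial Tran segment; every 40 02 header after it
    # advances one step through the fixed rotation.
    return ROTATION[segment_number % 4]


def anchor_pair(header_payload: bytes) -> tuple:
    # Two int16 BE anchors at payload bytes [14:16] and [16:18].
    return struct.unpack_from(">hh", header_payload, 14)


payload = bytes(14) + bytes.fromhex("0005fffd")  # invented example payload
order = [channel_for_segment(n) for n in range(6)]
anchors = anchor_pair(payload)
print(order, anchors)
```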
    """
    if len(body) < 7 or body[0:3] != b"\x00\x02\x00":
        return None
    channels = ["Tran", "Vert", "Long", "MicL"]
    out: dict = {ch: [] for ch in channels}
    # Initial Tran segment: preamble anchor pair + delta blocks before first 40 02.
    t0 = int.from_bytes(body[3:5], "big", signed=True)
    t1 = int.from_bytes(body[5:7], "big", signed=True)
    out["Tran"].extend([t0, t1])
    start = find_data_start(body)
    if start < 0:
        return out
    blocks = walk_body(body, start)
    seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
    def apply_blocks(channel: str, anchor: int,
                     block_start: int, block_end: int) -> int:
        """Apply delta blocks [block_start, block_end) to *channel*'s sample
        list, starting from *anchor*.  Returns the final cumulative value."""
        cur = anchor
        for bi in range(block_start, block_end):
            blk = blocks[bi]
            if (blk.tag_hi & 0xF0) == 0x10:
                # Both ``10 NN`` (NN ≤ 0xFC) and wide-NN ``1X NN`` (X != 0)
                # are nibble-delta streams.  The walker has already used the
                # right length; here we just iterate the payload bytes.
                for byte in blk.data:
                    for nib in ((byte >> 4) & 0xF, byte & 0xF):
                        cur += _s4(nib)
                        out[channel].append(cur)
            elif (blk.tag_hi & 0xF0) == 0x20:
                # ``20 NN`` and wide ``2X NN`` both carry int8 deltas.
                for byte in blk.data:
                    cur += _i8(byte)
                    out[channel].append(cur)
            elif blk.tag_hi == 0x00:
                for _ in range(blk.tag_lo):
                    out[channel].append(cur)
            elif blk.tag_hi == 0x30:
                # 12-bit signed deltas, packed as NN/4 groups of 6 bytes each:
                #   bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB first)
                #   bytes [2:6] = 4 × int8 low bytes
                # Each delta = sign_extend_12((high_nibble << 8) | low_byte).
                # Confirmed 2026-05-11 against all 14 ``30 NN`` blocks in the
                # bundled fixtures.
                n_groups = blk.tag_lo // 4
                for g in range(n_groups):
                    grp = blk.data[g * 6 : (g + 1) * 6]
                    if len(grp) < 6:
                        break
                    high_word = (grp[0] << 8) | grp[1]
                    for k in range(4):
                        nib = (high_word >> (12 - 4 * k)) & 0xF
                        v = (nib << 8) | grp[2 + k]
                        if v >= 0x800:
                            v -= 0x1000
                        cur += v
                        out[channel].append(cur)
            # 40 02: should not occur in segment data.
        return cur
    # Initial Tran segment: deltas from start of body up to first 40 02 (or end).
    first_seg = seg_idx[0] if seg_idx else len(blocks)
    last_tran_value = apply_blocks("Tran", t1, 0, first_seg)
    # Subsequent segments rotate channels.  Each segment header carries:
    #   bytes [0:2] and [2:4] = 2 deltas extending the PREVIOUS channel
    #   bytes [14:16] and [16:18] = anchor pair for THIS segment's channel
    #
    # Rotation: V, L, M, T, V, L, M, T, ...  (initial Tran segment is the
    # implicit T in the cycle.)
    rotation = ["Vert", "Long", "MicL", "Tran"]
    # Track each channel's "running cumulative value" so we can apply the
    # previous-channel extension deltas at every segment boundary.
    last_value = {"Tran": last_tran_value, "Vert": None, "Long": None, "MicL": None}
    for k, hi in enumerate(seg_idx):
        channel = rotation[k % 4]
        prev_channel = "Tran" if k == 0 else rotation[(k - 1) % 4]
        header = blocks[hi]
        if len(header.data) < 18:
            continue
        # Validate: real segment headers have bytes [12:14] = `02 00`.
        # Trailer/footer "40 02" markers contain ASCII serial bytes or other
        # non-header data there and would otherwise be mis-interpreted as
        # segment headers, adding spurious samples at the tail.
        if header.data[12:14] != b"\x02\x00":
            break
        # Extend the PREVIOUS channel by 2 more samples (deltas in bytes [0:4]).
        prev_d0 = int.from_bytes(header.data[0:2], "big", signed=True)
        prev_d1 = int.from_bytes(header.data[2:4], "big", signed=True)
        if last_value[prev_channel] is not None:
            v = last_value[prev_channel] + prev_d0
            out[prev_channel].append(v)
            v += prev_d1
            out[prev_channel].append(v)
            last_value[prev_channel] = v
        # Anchor pair for THIS segment's channel.
        c0 = int.from_bytes(header.data[14:16], "big", signed=True)
        c1 = int.from_bytes(header.data[16:18], "big", signed=True)
        out[channel].extend([c0, c1])
        # Apply delta blocks for this segment.
        next_hi = seg_idx[k + 1] if k + 1 < len(seg_idx) else len(blocks)
        last_value[channel] = apply_blocks(channel, c1, hi + 1, next_hi)
    return out
 # ── ADC-scale conversion helpers ────────────────────────────────────────────
 # Scaling factor: decode_waveform_v2 produces geo-channel samples in the BW
 # display quantization (16-count units, LSB = 0.005 in/s at Normal range).
 # The legacy consumer pipeline (sfm/event_hdf5.py) expects raw_samples in
 # 1-count ADC units (× full_scale / 32768 → physical).  To plug the new
 # decoder in without rewriting consumers, multiply geo values by 16.
 #
 # Mic samples are already in raw ADC counts (decoded value 1 = 1 mic ADC count
 # = 81.94 dB on the BW display).  Mic values pass through unchanged.
 _GEO_DECODER_TO_ADC = 16
 def decoded_to_adc_counts(decoded: dict) -> dict:
    """Convert :func:`decode_waveform_v2` output to int16 ADC counts.
    Geo channels are scaled by ×16 (decoder produces 16-count units,
    consumer expects 1-count ADC).  Mic is passed through as raw counts.
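    A small worked check of the unit arithmetic, as a sketch: the 0.005
    in/s LSB and the ×16 factor come from the comments above, while the
    10.24 in/s full-scale value is only derived here from those two
    numbers and is an assumption, not a confirmed device constant.

```python
import math

LSB_IN_PER_S = 0.005   # one decoder unit at Normal range
GEO_TO_ADC = 16        # decoder units -> 1-count ADC units

# Derived assumption: full scale such that 16 ADC counts equal one LSB.
FULL_SCALE_IN_PER_S = LSB_IN_PER_S * 32768 / GEO_TO_ADC


def decoder_units_to_in_per_s(v: int) -> float:
    return v * LSB_IN_PER_S


def adc_counts_to_in_per_s(counts: int) -> float:
    return counts * FULL_SCALE_IN_PER_S / 32768


decoded_value = 37
adc_value = decoded_value * GEO_TO_ADC
direct = decoder_units_to_in_per_s(decoded_value)
via_adc = adc_counts_to_in_per_s(adc_value)
print(adc_value, direct, via_adc)
```

    Both paths give the same physical value, which is the property the
    ×16 scaling is meant to preserve for legacy consumers.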
    """
    if not decoded:
        return {}
    return {
        "Tran": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Tran", [])],
        "Vert": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Vert", [])],
        "Long": [v * _GEO_DECODER_TO_ADC for v in decoded.get("Long", [])],
        "MicL": list(decoded.get("MicL", [])),
    }
 def mic_count_to_db(count: int) -> float:
    """Convert a MicL ADC count to dB(L) for BW-display-compatible output.
    Empirical formula (confirmed 2026-05-11 against V70 fixture: count=813
    → 140.1 dB; count=±1 → ±81.94 dB; count=±24 → ±109.5 dB):
        dB = sign(count) × (81.94 + 20 × log10(|count|))    for |count| ≥ 1
        dB = 0.0                                            for count == 0
    The constant 81.94 is the level of a single count: 10^(81.94/20) ≈ 12,500,
    so one mic ADC count is about 12,500 times the dB(L) reference amplitude.
    It is almost certainly a calibration constant from the device's mic.
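    The formula and its inverse can be sketched as below.  The forward
    function restates the empirical formula above; ``db_to_mic_count`` is
    a hypothetical inverse added for illustration and is only meaningful
    for levels of magnitude 81.94 dB or more (that is, at least one count).

```python
import math

MIC_OFFSET_DB = 81.94  # empirical constant from the formula above


def mic_count_to_db_sketch(count: int) -> float:
    if count == 0:
        return 0.0
    sign = 1.0 if count > 0 else -1.0
    return sign * (MIC_OFFSET_DB + 20.0 * math.log10(abs(count)))


def db_to_mic_count(db: float) -> float:
    # Hypothetical inverse: undo the offset, then the 20*log10 scaling.
    if db == 0:
        return 0.0
    sign = 1.0 if db > 0 else -1.0
    return sign * 10.0 ** ((abs(db) - MIC_OFFSET_DB) / 20.0)


level_813 = mic_count_to_db_sketch(813)
round_trip = round(db_to_mic_count(mic_count_to_db_sketch(-24)))
print(round(level_813, 2), round_trip)
```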
    """
    if count == 0:
        return 0.0
    sign = 1.0 if count > 0 else -1.0
    return sign * (81.94 + 20.0 * math.log10(abs(count)))
 # ── A5-frame entry point ────────────────────────────────────────────────────
 def decode_a5_frames(a5_frames) -> Optional[dict]:
    """Decode a list of A5 (BULK_WAVEFORM_STREAM) frames into per-channel
    int16 ADC samples.
    Returns ``{"Tran": [...], "Vert": [...], "Long": [...], "MicL": [...]}``
    with each channel's samples in **1-count ADC units** (the legacy
    ``event.raw_samples`` convention — multiply by ``full_scale / 32768``
    to convert to physical units; for mic, use :func:`mic_count_to_db` or
    a per-count psi factor).
    Returns ``None`` if the frames cannot be parsed.
    This is the wired-up production entry point.  It:
      1. Reconstructs the BW-binary body bytes from the A5 frames
         (``blastware_file.extract_body_bytes``).
      2. Runs the verified codec (``decode_waveform_v2``) on the body.
      3. Converts to int16 ADC counts via :func:`decoded_to_adc_counts`.
    """
    # Local import to avoid a cycle: blastware_file imports models and
    # ultimately client.py imports waveform_codec.
    from .blastware_file import extract_body_bytes
    if not a5_frames:
        return None
    _strt, body, _footer = extract_body_bytes(a5_frames)
    if not body:
        return None
    decoded = decode_waveform_v2(body)
    if decoded is None:
        return None
    return decoded_to_adc_counts(decoded)
@@ -0,0 +1,360 @@
 """
 scratch/next_experiment_skeleton.py — segment-channel scoring analyzer.
 This is the suggested NEXT EXPERIMENT for cracking the waveform body codec.
 The goal is to figure out what segments 1+ contain, since segment 0 = Tran
 is solved but multi-segment continuation diverges from truth at sample ~512.
 ────────────────────────────────────────────────────────────────────────────
 The hypothesis to test
 ────────────────────────────────────────────────────────────────────────────
 Segments rotate through channels:
    segment 0  →  Tran samples 0..509
    segment 1  →  Vert samples 0..507
    segment 2  →  Long samples 0..507
    segment 3  →  Mic  samples 0..507
    segment 4  →  Tran samples 510..N (continuation)
    ...
 This would explain why segment 0 works perfectly (it's pure Tran) and why
 applying segment 1's blocks as Tran continuation gives wrong values
 (it's actually Vert).
 ────────────────────────────────────────────────────────────────────────────
 What the analyzer should do
 ────────────────────────────────────────────────────────────────────────────
 For each segment in each fixture event:
 1. Run the segment-0 block-walker + RLE decode (the same algorithm that
   ``decode_tran_initial`` uses) over the segment's blocks.  Start from
   some anchor value and produce a cumulative trajectory of length =
   number-of-deltas-in-segment.
 2. For each candidate channel C ∈ {Tran, Vert, Long, MicL}:
   For each candidate anchor location in the segment-header payload
   (try [0:2], [2:4], [4:6], [14:16], [16:18] as int16 BE):
       Compare the decoded trajectory against truth[C] starting from
       the segment's first sample index.
       Score = number of matches (or sum of squared errors).
 3. Report the best (channel, anchor-location) combination per segment.
 If the rotation hypothesis is correct, you'll see:
    segment 0  →  best score for (Tran, preamble bytes [3:5])    ✓ already known
    segment 1  →  best score for (Vert, <some-header-byte>)
    segment 2  →  best score for (Long, <some-header-byte>)
    segment 3  →  best score for (MicL, <some-header-byte>)
    segment 4  →  best score for (Tran, continuing from segment 0's end)
 If the rotation hypothesis is NOT correct, the scorer will at least narrow
 down what segment 1 actually carries.  Maybe channels interleave at finer
 granularity, or maybe segments alternate by something other than channel.
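 The scoring idea in the steps above can be sketched on toy data.  This is
 a minimal illustration under stated assumptions: ``best_alignment`` is a
 hypothetical name, the deltas and truth list are invented, and a real run
 would use the decoded segment blocks and the BW ASCII export instead.

```python
from itertools import accumulate
from typing import List, Tuple


def best_alignment(deltas: List[int], truth: List[int]) -> Tuple[int, int]:
    # Cumulative trajectory relative to an anchor of 0.
    trajectory = list(accumulate(deltas))
    n = len(trajectory)
    best = (-1, 0)  # (start sample, match count)
    for s in range(len(truth) - n):
        anchor = truth[s]  # re-anchor on the candidate start sample
        matches = sum(
            1 for i, rel in enumerate(trajectory)
            if truth[s + 1 + i] == anchor + rel
        )
        if matches > best[1]:
            best = (s, matches)
    return best


toy_truth = [0, 5, 6, 8, 7, 7, 9, 3]
toy_deltas = [1, 2, -1, 0]
start, matches = best_alignment(toy_deltas, toy_truth)
print(start, matches)
```

 Running the same search once per channel and comparing match counts is
 what lets the correct channel stand out without hand-coding an
 assumption about which header bytes hold the anchor.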
 ────────────────────────────────────────────────────────────────────────────
 Why this is a scoring analyzer, not a hand-written decoder
 ────────────────────────────────────────────────────────────────────────────
 Direct hand-coding ("assume segment 1 is Vert with anchor at byte X") gets
 stuck when the assumption is wrong because the failure mode is silent —
 you get plausible-looking-but-wrong samples and have to manually diff
 against truth to debug.
 The scorer is brute-force but cheap: every fixture event × every segment ×
 4 channels × 5 anchor-byte candidates is only ~hundreds of comparisons.
 The winning combination jumps out by score.
 ────────────────────────────────────────────────────────────────────────────
 Skeleton
 ────────────────────────────────────────────────────────────────────────────
 """
 from __future__ import annotations
 import os
 import re
 import sys
 from dataclasses import dataclass
 from typing import List, Optional, Tuple
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
 from minimateplus.waveform_codec import walk_body, find_data_start, WaveformBlock
 # ── Reusable pieces ──────────────────────────────────────────────────────────
 CHANNELS = ("Tran", "Vert", "Long", "MicL")
 LSB_INV = 200  # 1 in/s / 0.005 in/s/LSB; multiply BW-export floats by this
               # to get 16-count units (the body's native quantization).
@dataclass
 class FixtureEvent:
    name: str           # e.g. "M529LL1A.SP0"
    bin_path: str
    txt_path: str
    body: bytes
    truth: dict         # {channel: list of int16-quantized samples}
    blocks: List[WaveformBlock]
    segment_starts: List[int]  # block indices of each 40 02 segment header
    segment_sample_starts: List[int]  # for each segment, the truth sample index it starts at
 def s4(n: int) -> int:
    """4-bit signed nibble decode."""
    return n if n < 8 else n - 16
 def i8(b: int) -> int:
    """int8 reinterpret of unsigned byte."""
    return b if b < 128 else b - 256
 def load_fixture(name: str) -> FixtureEvent:
    """Load a fixture event with its truth values and parsed block stream."""
    # Find the fixture (search both subdirs of tests/fixtures/).
    base = os.path.join(os.path.dirname(__file__), "..", "tests", "fixtures")
    candidates = [
        os.path.join(base, "5-11-26", name),
        os.path.join(base, "decode-re-5-8-26", "event-a", name),  # not used directly
    ]
    bin_path = next((c for c in candidates if os.path.exists(c)), None)
    if bin_path is None:
        # Try a glob walk for the 5-8 fixtures (they're in subdirs).
        for root, _, files in os.walk(base):
            if name in files:
                bin_path = os.path.join(root, name)
                break
    if bin_path is None:
        raise FileNotFoundError(name)
    txt_path = bin_path + ".TXT"
    with open(bin_path, "rb") as f:
        raw = f.read()
    body = raw[43:-26]
    truth = _parse_txt(txt_path)
    blocks = walk_body(body, find_data_start(body))
    seg_idx = [i for i, b in enumerate(blocks) if b.tag_hi == 0x40]
    # Segment 0 starts at sample 0; subsequent segments start at the
    # cumulative sample count from previous segment(s).  Tran's segment 0
    # is N samples; if rotation hypothesis is correct, segment 1's data
    # starts at sample 0 for a *different* channel.  The analyzer should
    # try both "continues from previous segment" and "starts at sample 0
    # of a different channel."
    seg_sample_starts = _compute_segment_sample_starts(blocks, seg_idx)
    return FixtureEvent(
        name=name, bin_path=bin_path, txt_path=txt_path,
        body=body, truth=truth, blocks=blocks,
        segment_starts=seg_idx, segment_sample_starts=seg_sample_starts,
    )
 def _parse_txt(path: str) -> dict:
    """Parse BW ASCII TXT export into {channel: [int_samples_in_16_count_units]}."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()
    header_idx = next(
        (i for i, l in enumerate(lines)
         if all(c in l for c in CHANNELS)),
        None,
    )
    if header_idx is None:
        return {ch: [] for ch in CHANNELS}
    out = {ch: [] for ch in CHANNELS}
    for line in lines[header_idx + 1:]:
        parts = re.split(r"\s+", line.strip())
        if len(parts) < 4:
            continue
        try:
            vals = [float(p) for p in parts[:4]]
        except ValueError:
            continue
        for ch, v in zip(CHANNELS, vals):
            # Multiply by LSB_INV; geo channels are in in/s, MicL is in dB(L)
            # (which doesn't quantize the same way — leaving raw for MicL is fine,
            # the scorer should treat MicL specially).
            out[ch].append(round(v * LSB_INV) if ch != "MicL" else v)
    return out
 def _compute_segment_sample_starts(
    blocks: List[WaveformBlock], seg_idx: List[int]
 ) -> List[int]:
    """Cumulative sample-count up to each segment header (if all blocks treated
    as Tran continuation).  Useful as one candidate for segment-1-Tran tests.
    The scorer should ALSO try "segment 1 starts at sample 0 of a new channel"
    as the rotation hypothesis predicts.
    """
    starts = []
    cum = 2  # T[0] + T[1] from preamble
    for i, b in enumerate(blocks):
        if i in seg_idx:
            starts.append(cum)
        if b.tag_hi in (0x10, 0x20, 0x00):
            cum += b.tag_lo  # each of these block types contributes NN samples
        # 30 NN and 40 02 don't contribute samples (for this hypothesis)
    return starts
 # ── The core algorithm: decode a segment's blocks as deltas ─────────────────
 def decode_segment_as_channel(
    blocks: List[WaveformBlock],
    seg_start_block_idx: int,
    seg_end_block_idx: int,
    anchor: int,
 ) -> List[int]:
    """Apply the segment-0 codec rules to a range of blocks, starting from *anchor*.
    Returns a list of cumulative sample values (one per delta).  Does NOT include
    the anchor itself in the output — the first returned value is anchor + first_delta.
    """
    out = []
    cur = anchor
    for bi in range(seg_start_block_idx, seg_end_block_idx):
        blk = blocks[bi]
        if blk.tag_hi == 0x10:
            for byte in blk.data:
                for nib in ((byte >> 4) & 0xF, byte & 0xF):
                    cur += s4(nib)
                    out.append(cur)
        elif blk.tag_hi == 0x20:
            for byte in blk.data:
                cur += i8(byte)
                out.append(cur)
        elif blk.tag_hi == 0x00:
            for _ in range(blk.tag_lo):
                out.append(cur)
        # 30 NN: skip (content unknown)
        # 40 02: shouldn't appear in segment data (it's the segment header)
    return out
 def score_against_truth(
    decoded: List[int],
    truth: List[int],
    truth_start: int,
 ) -> Tuple[int, int]:
    """Compare *decoded* to truth[truth_start : truth_start + len(decoded)].
    Returns (n_matches, n_compared).
    """
    n = min(len(decoded), len(truth) - truth_start)
    if n <= 0:
        return (0, 0)
    matches = sum(1 for i in range(n) if decoded[i] == truth[truth_start + i])
    return (matches, n)
 # ── TODO for the next pass ──────────────────────────────────────────────────
 def score_segment_against_all_channels(
    event: FixtureEvent,
    segment_index: int,
 ) -> List[Tuple[str, int, int, int]]:
    """For segment *segment_index* of *event*, find the best (channel, start_sample)
    fit.
    For each geophone channel C (Tran, Vert, Long; MicL truth is in dB(L) and
    is skipped) and each candidate starting truth-sample index s, we take
    truth[C][s] as the anchor and score the N decoded values against
    truth[C][s+1 : s+N+1].
    Returns rows of (channel_name, start_sample, n_matches, n_compared)
    sorted by match-count descending.
    """
    # Block range of this segment: from the segment header (inclusive) up to
    # the next segment header (exclusive), or end-of-blocks.
    seg_header_idx = event.segment_starts[segment_index]
    next_header_idx = (
        event.segment_starts[segment_index + 1]
        if segment_index + 1 < len(event.segment_starts)
        else len(event.blocks)
    )
    # Decode the segment's data blocks (skip the segment-header block itself).
    # Use anchor=0 — we'll re-anchor when scoring against each channel.
    deltas_trajectory = decode_segment_as_channel(
        event.blocks, seg_header_idx + 1, next_header_idx, anchor=0
    )
    if not deltas_trajectory:
        return []
    n = len(deltas_trajectory)
    results = []
    for ch in ("Tran", "Vert", "Long"):
        truth = event.truth.get(ch)
        if not truth or len(truth) < n + 1:
            continue
        # For each candidate starting sample s in truth, check if applying
        # the deltas starting from truth[s] reproduces truth[s+1:s+n+1].
        best = (0, -1)
        for s in range(len(truth) - n):
            anchor = truth[s]
            # deltas_trajectory was decoded with anchor=0, so the re-anchored
            # trajectory value at index i is anchor + deltas_trajectory[i].
            matches = 0
            for i in range(n):
                if truth[s + i + 1] == anchor + deltas_trajectory[i]:
                    matches += 1
                # Note: we could break early on first mismatch for "matches start",
                # but counting total matches gives a more robust score.
            if matches > best[0]:
                best = (matches, s)
        results.append((ch, best[1], best[0], n))
    results.sort(key=lambda r: -r[2])
    return results
 # ── Driver ──────────────────────────────────────────────────────────────────
 def main():
    """Run the analyzer on all loud-bundle events and print best scores."""
    events = ["M529LL1A.SP0", "M529LL1A.SS0", "M529LL1A.SV0",
              "M529LL1L.JQ0", "M529LL1L.V70"]
    for name in events:
        try:
            event = load_fixture(name)
        except FileNotFoundError:
            print(f"{name}: fixture not found")
            continue
        print(f"\n=== {name} ===")
        print(f"  body bytes: {len(event.body)}")
        print(f"  blocks: {len(event.blocks)}")
        print(f"  segments: {len(event.segment_starts)}")
        print("  segment sample-starts (if all blocks are 1 channel):")
        for si, sample_start in enumerate(event.segment_sample_starts):
            print(f"    seg {si}: sample {sample_start}")
        for si in range(len(event.segment_starts)):
            results = score_segment_against_all_channels(event, si)
            if not results:
                print(f"  seg {si}: (no scorable data)")
                continue
            top = results[0]
            tag = "✓" if top[2] / max(top[3], 1) > 0.9 else " "
            print(f"  seg {si}: best fit {tag} = {top[0]:<5} "
                  f"starting at sample {top[1]:>5}, {top[2]:>4}/{top[3]:<4} match"
                  + (f"  (next: {results[1][0]} @{results[1][1]} {results[1][2]}/{results[1][3]})"
                     if len(results) > 1 else ""))
 if __name__ == "__main__":
    main()
@@ -0,0 +1,518 @@
 """
 Tests for minimateplus.waveform_codec — Blastware waveform-file body block walker.
 These tests lock in the STRUCTURAL framing of the body codec and the per-sample
 decode: the output of :func:`decode_waveform_v2` is compared sample by sample
 against the Blastware ASCII exports bundled with the fixtures.
 """
 from __future__ import annotations
 import os
 import pytest
 from minimateplus.waveform_codec import (
    WaveformBlock,
    decode_tran_initial,
    decode_waveform_v2,
    decoded_to_adc_counts,
    find_data_start,
    mic_count_to_db,
    parse_segment_header,
    split_segments,
    walk_body,
 )
 FIXTURES = os.path.join(
    os.path.dirname(__file__), "fixtures", "decode-re-5-8-26"
 )
 def _bw_body(path):
    """Strip the 22-byte header, the 21-byte STRT record, and the 26-byte footer to get the body."""
    with open(path, "rb") as f:
        binary = f.read()
    return binary[43:-26]
 # Fixture metadata — bundled BW binaries from a real BE11529 unit, May 8 2026.
 # Each is paired with a Blastware TXT export (the ASCII ground truth).
 FIXTURES_INFO = {
    "event-a": {
        "filename": "M529LKVQ.6S0",
        "n_samples": 3328,    # 3.0 s rectime + 0.25 s pretrig at 1024 sps
        "rectime": 3.0,
    },
    "event-b": {
        "filename": "M529LK5Q.RG0",
        "n_samples": 2304,    # 2.0 s
        "rectime": 2.0,
    },
    "event-c": {
        "filename": "M529LK44.AB0",
        "n_samples": 1280,    # 1.0 s
        "rectime": 1.0,
    },
    "event-d": {
        "filename": "M529LK2V.470",
        "n_samples": 1280,
        "rectime": 1.0,
    },
 }
 def _fixture_path(event_name):
    info = FIXTURES_INFO[event_name]
    return os.path.join(FIXTURES, event_name, info["filename"])
 # ── Find data start ──────────────────────────────────────────────────────────
@pytest.mark.parametrize("event_name", list(FIXTURES_INFO.keys()))
 def test_find_data_start_locates_first_block(event_name):
    """The walker auto-detects the first recognized block tag within the first 20 bytes."""
    path = _fixture_path(event_name)
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    body = _bw_body(path)
    start = find_data_start(body)
    assert 0 <= start < 20, f"expected start in [0, 20), got {start}"
    assert body[start] in (0x00, 0x10, 0x20, 0x30, 0x40), (
        f"first tag byte 0x{body[start]:02x} not a recognized block type"
    )
    assert body[start + 1] % 4 == 0 or (body[start] == 0x40 and body[start + 1] == 0x02)
 def test_find_data_start_canonical_offset_7():
    """All events have a 7-byte preamble (3-byte magic + 4-byte Tran anchors)."""
    for name in FIXTURES_INFO:
        path = _fixture_path(name)
        if not os.path.exists(path):
            pytest.skip(f"fixture missing: {path}")
        body = _bw_body(path)
        # Sanity: magic
        assert body[0:3] == b"\x00\x02\x00", f"{name}: bad magic"
        # First tag at offset 7
        assert find_data_start(body) == 7, f"{name}: expected start=7"
 # ── Block walker ─────────────────────────────────────────────────────────────
 def test_walk_body_empty_returns_empty():
    assert walk_body(b"") == []
 def test_walk_body_invalid_start_returns_empty():
    # Body that does not begin with a recognized tag.
    assert walk_body(b"\xff\xff\xff\xff", start=0) == []
@pytest.mark.parametrize("event_name", list(FIXTURES_INFO.keys()))
 def test_walk_body_produces_blocks(event_name):
    """The walker should produce a non-empty stream of blocks for every fixture."""
    path = _fixture_path(event_name)
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    body = _bw_body(path)
    blocks = walk_body(body)
    assert len(blocks) > 0
    # All blocks have one of the known tag families.  ``1X NN`` / ``2X NN``
    # with X in 0..F are valid (X > 0 means wide-NN encoding).
    for b in blocks:
        assert (b.tag_hi & 0xF0) in (0x10, 0x20, 0x00, 0x30, 0x40), (
            f"unknown tag {b.tag_hi:#04x} at offset {b.offset}"
        )
@pytest.mark.parametrize("event_name", list(FIXTURES_INFO.keys()))
 def test_walk_body_block_lengths_consistent(event_name):
    """Each block's recorded length matches its on-wire footprint."""
    path = _fixture_path(event_name)
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    body = _bw_body(path)
    blocks = walk_body(body)
    for b in blocks:
        # Tag (2 bytes) + payload should equal length.
        assert 2 + len(b.data) == b.length, (
            f"block at {b.offset} length mismatch: tag(2) + data({len(b.data)}) != length({b.length})"
        )
@pytest.mark.parametrize("event_name", list(FIXTURES_INFO.keys()))
 def test_walk_body_blocks_contiguous(event_name):
    """Block n+1 starts exactly where block n ends (no gaps, no overlaps)."""
    path = _fixture_path(event_name)
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    body = _bw_body(path)
    blocks = walk_body(body)
    for i in range(1, len(blocks)):
        prev = blocks[i - 1]
        cur = blocks[i]
        assert cur.offset == prev.offset + prev.length, (
            f"gap/overlap between block {i-1} (off={prev.offset} len={prev.length}) "
            f"and block {i} (off={cur.offset})"
        )
 # ── Segment splitting ────────────────────────────────────────────────────────
@pytest.mark.parametrize("event_name", list(FIXTURES_INFO.keys()))
 def test_split_segments_yields_at_least_one(event_name):
    path = _fixture_path(event_name)
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    body = _bw_body(path)
    blocks = walk_body(body)
    segments = split_segments(blocks)
    assert len(segments) > 0
 def test_split_segments_segment_count_at_least_one_per_event():
    """The walker should produce at least one ``40 02`` segment header per event.
    Historical note: the walker used to bail out partway through event-b
    (past offset ~427) until wide-NN tag support was added.  All four events
    now walk to the end and have many segment headers.
    """
    for name in FIXTURES_INFO:
        path = _fixture_path(name)
        if not os.path.exists(path):
            continue
        body = _bw_body(path)
        blocks = walk_body(body)
        n_40 = sum(1 for b in blocks if b.tag_hi == 0x40)
        assert n_40 >= 1, f"{name}: no 40 02 segment header found"
 # ── Segment header parsing ───────────────────────────────────────────────────
 def test_parse_segment_header_returns_none_for_non_40():
    block = WaveformBlock(offset=0, tag_hi=0x10, tag_lo=0x04, data=b"\x00\x00", length=4)
    assert parse_segment_header(block) is None
 def test_parse_segment_header_decodes_fields():
    """Decode a known 40 02 block to verify field offsets."""
    # First segment header from event-c at body offset 235:
    # 40 02 00 00 00 00 0a 4b 01 1e 47 00 00 00 02 00 00 01 00 01
    payload = bytes.fromhex("00000000 0a4b011e 47000000 02000001 0001".replace(" ", ""))
    block = WaveformBlock(
        offset=235, tag_hi=0x40, tag_lo=0x02, data=payload, length=20
    )
    decoded = parse_segment_header(block)
    assert decoded is not None
    assert decoded["counter"] == 0x47       # uint32 LE
    assert decoded["fixed_pattern"] == b"\x02\x00\x00\x01"
    assert decoded["anchor_bytes"] == b"\x00\x00\x00\x00"
 def test_segment_counter_increments():
    """The 4-byte counter at bytes [8:12] of each 40 02 payload increments by 1."""
    path = _fixture_path("event-c")
    if not os.path.exists(path):
        pytest.skip("fixture missing")
    body = _bw_body(path)
    blocks = walk_body(body)
    headers = [b for b in blocks if b.tag_hi == 0x40 and b.tag_lo == 0x02]
    counters = [parse_segment_header(b)["counter"] for b in headers]
    assert len(counters) >= 5, "expect at least 5 segments to verify increments"
    # The first few counters should be non-decreasing.  The BW counter is global
    # (it increments across the whole flash buffer), and some events may share a
    # counter value with the previous event's tail block, so equal neighbours
    # are allowed.
    for i in range(1, min(8, len(counters))):
        assert counters[i] >= counters[i - 1], (
            f"counter went backwards: {counters[i-1]} → {counters[i]}"
        )
 # ── decode_waveform_v2: verified against ground truth ────────────────────────
@pytest.mark.parametrize("event_name", list(FIXTURES_INFO.keys()))
 def test_decode_waveform_v2_returns_dict(event_name):
    """decode_waveform_v2 returns a dict with all 4 channels (verified 2026-05-11)."""
    path = _fixture_path(event_name)
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    body = _bw_body(path)
    result = decode_waveform_v2(body)
    assert result is not None
    assert set(result.keys()) == {"Tran", "Vert", "Long", "MicL"}
 # Multi-channel ground-truth fixtures.  Each row: (path, channel, n_to_verify).
 # These lock in the channel-rotation hypothesis: segments cycle T → V → L → M,
 # with each segment header carrying a 2-sample anchor pair (bytes [14:18])
 # for THIS segment's channel plus 2 continuation deltas (bytes [0:4]) for
 # the PREVIOUS channel.
 MULTICHANNEL_FIXTURES = [
    # V70 / JQ0: all three geo channels decode in full.
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Tran", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Vert", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"), "Long", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Tran", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Vert", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"), "Long", 3328),
    # SP0 (loud all-channels): NOW fully decodes after the wide-NN walker fix.
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"), "Tran", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"), "Vert", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"), "Long", 3328),
    # SS0 / SV0 (loud-from-start): walker now reaches 3072–3078 samples per
    # channel (out of 3079 total).  A few tail samples still missing.
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"), "Tran", 3078),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"), "Vert", 3072),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"), "Long", 3072),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"), "Tran", 3078),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"), "Vert", 3072),
    (os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"), "Long", 3072),
    # 5-8-26 quiet bundle: events without 30 NN blocks decode FULLY across all channels.
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-a", "M529LKVQ.6S0"), "Tran", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-a", "M529LKVQ.6S0"), "Vert", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-a", "M529LKVQ.6S0"), "Long", 3328),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-c", "M529LK44.AB0"), "Tran", 1280),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-c", "M529LK44.AB0"), "Vert", 1280),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-c", "M529LK44.AB0"), "Long", 1280),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-d", "M529LK2V.470"), "Tran", 1280),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-d", "M529LK2V.470"), "Vert", 1280),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-d", "M529LK2V.470"), "Long", 1280),
    # event-b: 2304 samples × 3 — now fully decodes (was the historical
    # walker-stop case; fixed by wide-NN tag support).
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-b", "M529LK5Q.RG0"), "Tran", 2304),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-b", "M529LK5Q.RG0"), "Vert", 2304),
    (os.path.join(os.path.dirname(__file__), "fixtures", "decode-re-5-8-26",
                  "event-b", "M529LK5Q.RG0"), "Long", 2304),
 ]
@pytest.mark.parametrize("path,channel,n", MULTICHANNEL_FIXTURES)
 def test_decode_waveform_v2_channels_match_truth(path, channel, n):
    """Decoded channels match the BW ASCII export byte-exact for the verified ranges."""
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    with open(path, "rb") as f:
        body = f.read()[43:-26]
    truth = _full_truth_channel(path, channel)
    decoded = decode_waveform_v2(body)
    assert decoded is not None
    pred = decoded[channel]
    assert len(pred) >= n, f"only {len(pred)} samples decoded, expected ≥ {n}"
    for i in range(n):
        assert pred[i] == truth[i], (
            f"{os.path.basename(path)} {channel}[{i}]: pred={pred[i]} truth={truth[i]}"
        )
 # ── decode_tran_initial: confirmed correct against ground truth ──────────────
 # Bundled fixtures for the high-amplitude 5-11-26 events (PPV ~6-7 in/s).
 # These cracked the Tran codec — see waveform_codec module docstring.
 TRAN_INITIAL_FIXTURES = [
    # (path, expected first N Tran samples in 16-count units,
    #  minimum number of samples the decoder must produce)
    (
        os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SP0"),
        [4, 4, 3, 3, 3, 2, 2, 3, 2, 2, 2, 2, 1, 1, 1, 2, 1, 1, 1, 0, 1, 0],
        22,
    ),
    (
        os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SS0"),
        [-89, -89, -91, -91, -92, -93, -94, -94, -94, -94],
        42,
    ),
    (
        os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1A.SV0"),
        [-745, -762, -771, -774, -779, -794, -808, -811, -811, -819],
        46,
    ),
    # Vert-heavy event (T near zero) — segment 0 = 510 samples, all decode correctly.
    (
        os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.JQ0"),
        [0] * 4 + [-1, 0, 0, -1, -1, 0],
        38,
    ),
    # Mic-heavy event (geos all near zero) — segment 0 = 482 samples.
    (
        os.path.join(os.path.dirname(__file__), "fixtures", "5-11-26", "M529LL1L.V70"),
        [0] * 10,
        6,
    ),
 ]
 def _full_truth(path):
    """Load Tran samples (in 16-count units) from the BW ASCII export."""
    return _full_truth_channel(path, "Tran")
 def _full_truth_channel(path, channel):
    """Load one channel's samples (in 16-count units) from the BW ASCII export."""
    import glob
    import re
    col_idx = {"Tran": 0, "Vert": 1, "Long": 2, "MicL": 3}[channel]
    # event-a's TXT has a typo ("M59" vs "M529") — pick the .TXT in the same dir
    # rather than assuming exact-name correspondence.
    txt_path = path + ".TXT"
    if not os.path.exists(txt_path):
        candidates = glob.glob(os.path.join(os.path.dirname(path), "*.TXT"))
        if candidates:
            txt_path = candidates[0]
    with open(txt_path, "r", encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()
    header_idx = None
    for i, line in enumerate(lines):
        if "Tran" in line and "Vert" in line and "Long" in line and "MicL" in line:
            header_idx = i
            break
    if header_idx is None:
        return None
    out = []
    for line in lines[header_idx + 1:]:
        parts = re.split(r"\s+", line.strip())
        if len(parts) < 4:
            continue
        try:
            out.append(round(float(parts[col_idx]) * 200))
        except ValueError:
            continue
    return out
@pytest.mark.parametrize("path,expected,n_required", TRAN_INITIAL_FIXTURES)
 def test_decode_tran_initial_matches_ground_truth(path, expected, n_required):
    """The Tran initial decoder produces values matching the BW ASCII export exactly."""
    if not os.path.exists(path):
        pytest.skip(f"fixture missing: {path}")
    with open(path, "rb") as f:
        raw = f.read()
    body = raw[43:-26]
    decoded = decode_tran_initial(body)
    assert decoded is not None
    # Check first len(expected) samples match exactly.
    for i in range(len(expected)):
        assert decoded[i] == expected[i], (
            f"sample {i}: decoded={decoded[i]} expected={expected[i]}"
        )
    # And we got at least n_required samples decoded.
    assert len(decoded) >= n_required, (
        f"decoded only {len(decoded)} samples, expected at least {n_required}"
    )
 def test_decode_tran_initial_handles_empty():
    assert decode_tran_initial(b"") is None
    assert decode_tran_initial(b"not a body") is None
 def test_decode_tran_initial_synthetic_body():
    """A synthetic body with preamble + one 10 04 block decodes correctly."""
    # Magic + T[0]=10 + T[1]=20 in 16-count units.
    # Then 10 04 block with 4 nibbles: (+1, -1, +2, -2)
    # Encoded high-nibble first: 0x1F = (1, -1), 0x2E = (2, -2)
    body = b"\x00\x02\x00\x00\x0a\x00\x14" + b"\x10\x04" + b"\x1f\x2e"
    decoded = decode_tran_initial(body)
    # T[0]=10, T[1]=20, then deltas (+1, -1, +2, -2) from T[1]=20
    assert decoded == [10, 20, 21, 20, 22, 20]
 def test_decode_tran_initial_with_rle():
    """A synthetic body with 00 NN RLE block runs the current Tran value forward."""
    # T[0]=5, T[1]=5, then 00 08 RLE block = 8 zero deltas → T[2..9] = 5
    body = b"\x00\x02\x00\x00\x05\x00\x05" + b"\x00\x08"
    decoded = decode_tran_initial(body)
    assert decoded == [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
 def test_decode_tran_initial_full_segment_silent_events():
    """For events with near-silent Tran, segment 0 (~482-510 samples) decodes fully."""
    for path, _, _ in TRAN_INITIAL_FIXTURES[3:]:  # JQ0 (Vert-heavy) and V70 (Mic-heavy)
        if not os.path.exists(path):
            pytest.skip(f"fixture missing: {path}")
        with open(path, "rb") as f:
            body = f.read()[43:-26]
        truth = _full_truth(path)
        decoded = decode_tran_initial(body)
        assert decoded is not None
        # The decoder should produce a clean run of samples; check ALL of them
        # match truth (segment 0 is fully solved for events where T is near zero).
        n = len(decoded)
        for i in range(n):
            assert decoded[i] == truth[i], (
                f"{os.path.basename(path)}: sample {i}: decoded={decoded[i]} truth={truth[i]}"
            )
        # And we should have decoded at least 400 samples (= segment 0 worth).
        assert n >= 400, f"only {n} samples decoded for {path}"
 # ── ADC scaling + dB conversion ──────────────────────────────────────────────
 def test_decoded_to_adc_counts_geo_scales_by_16():
    """Geo channels in decoder units (16-count) should multiply by 16 to ADC."""
    decoded = {"Tran": [0, 1, -2, 100], "Vert": [5], "Long": [-10], "MicL": [813]}
    adc = decoded_to_adc_counts(decoded)
    assert adc["Tran"] == [0, 16, -32, 1600]
    assert adc["Vert"] == [80]
    assert adc["Long"] == [-160]
    # Mic passes through unchanged (already ADC counts).
    assert adc["MicL"] == [813]
 def test_decoded_to_adc_counts_empty():
    assert decoded_to_adc_counts({}) == {}
    assert decoded_to_adc_counts(
        {"Tran": [], "Vert": [], "Long": [], "MicL": []}
    ) == {"Tran": [], "Vert": [], "Long": [], "MicL": []}
 def test_mic_count_to_db_zero_is_zero():
    assert mic_count_to_db(0) == 0.0
 def test_mic_count_to_db_unit_is_reference():
    """count = ±1 → ±81.94 dB (the calibration reference)."""
    assert abs(mic_count_to_db(1) - 81.94) < 0.01
    assert abs(mic_count_to_db(-1) - (-81.94)) < 0.01
 def test_mic_count_to_db_doubles_every_6db():
    """Each doubling of |count| adds ~6.02 dB."""
    # count=2 → 87.96 dB (+ 6.02 from 81.94)
    assert abs(mic_count_to_db(2) - 87.96) < 0.05
    # count=4 → 93.98 dB
    assert abs(mic_count_to_db(4) - 93.98) < 0.05
    # count=8 → 100.00 dB
    assert abs(mic_count_to_db(8) - 100.00) < 0.05
 def test_mic_count_to_db_v70_peak():
    """V70 mic peak count 813 → 140.14 dB (matches BW reported PSPL 140.1)."""
    assert abs(mic_count_to_db(813) - 140.14) < 0.1
    # And the negative-direction equivalent
    assert abs(mic_count_to_db(-813) - (-140.14)) < 0.1
 # ── End-to-end: decode_a5_frames (production entry point) ───────────────────
 def test_decode_a5_frames_empty():
    from minimateplus.waveform_codec import decode_a5_frames
    assert decode_a5_frames([]) is None
    assert decode_a5_frames(None) is None
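The scaling and dB tests above pin down two conversions: geo channels are multiplied by 16 to reach ADC counts, and mic counts map to dB through a log formula anchored at 81.94 dB for a count of 1. A minimal sketch of both follows, assuming the formula quoted in the commit log, `dB = sign(count) × (81.94 + 20 × log10(|count|))`. The names `sketch_mic_count_to_db` and `sketch_to_adc_counts` are hypothetical stand-ins, not the functions in `minimateplus/waveform_codec.py`.

```python
import math

GEO_COUNTS_PER_UNIT = 16   # decoder units are 16 ADC counts each (per the tests)
MIC_REFERENCE_DB = 81.94   # dB value at |count| == 1 (per the tests)


def sketch_mic_count_to_db(count: int) -> float:
    """Hypothetical stand-in for mic_count_to_db (illustration only)."""
    if count == 0:
        return 0.0
    magnitude_db = MIC_REFERENCE_DB + 20.0 * math.log10(abs(count))
    # Carry the sign of the count onto the dB magnitude.
    return math.copysign(magnitude_db, count)


def sketch_to_adc_counts(decoded: dict) -> dict:
    """Scale geo channels by 16; pass MicL through unchanged."""
    return {
        name: list(samples) if name == "MicL"
        else [s * GEO_COUNTS_PER_UNIT for s in samples]
        for name, samples in decoded.items()
    }
```

Each doubling of the count adds `20 × log10(2)`, about 6.02 dB, which is the relationship `test_mic_count_to_db_doubles_every_6db` checks.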
Author	SHA1	Message	Date
serversdown	d85df4c886	Merge pull request 'merge full s3 codec decoded' (#23 ) from codec-re into main Reviewed-on: #23	2026-05-20 13:45:32 -04:00
Claude	0466bb4f44	codec: crack wide-NN blocks (1X NN / 2X NN); loud events now fully decode When NN exceeds 0xFC, the codec extends to 12-bit NN by using the low nibble of the TYPE byte as the high nibble of NN: 1X NN → nibble-delta block, NN = (X << 8) \| NN_byte 2X NN → int8-delta block, same NN encoding Walker and decode_waveform_v2 now handle both narrow (X=0) and wide (X != 0) forms uniformly. Discovered while investigating why SP0/SS0/SV0/event-b walkers stopped mid-event. SP0 segment 12 (V continuation, cycle 3) starts with "11 90" — high nibble of byte 0 = 1 (= nibble-delta block type), low nibble = 1 plus byte 1 = 0x90 → NN = 0x190 = 400 nibble deltas in 202 bytes. Walker was rejecting "11" as a non-tag. Sample count went from 47,364 to 72,972 verified byte-exact: event-a: 9984 (full) was 9984 (full) event-b: 6912 (full) was 738 event-c: 3840 (full) was 3840 (full) event-d: 3840 (full) was 3840 (full) JQ0: 9984 (full) was 9984 (full) V70: 9984 (full) was 9984 (full) SP0: 9984 (full) was 5122 SS0: 9222 (-7 tail) was 1758 SV0: 9222 (-7 tail) was 2114 7 of 9 fixtures now decode end-to-end across all 3 geo channels. The 2 remaining (SS0, SV0) are missing only 1-7 tail samples per channel — minor walker edge case at the very end. 74 tests pass (was 71).	2026-05-20 17:28:54 +00:00
Claude	85f4bcfe86	codec: wire decode_waveform_v2 into production; add MicL dB helper Replaces the broken legacy int16 LE decoder in client.py with the verified multi-channel codec. Three changes: 1. blastware_file.extract_body_bytes(a5_frames) — new helper that factors out the body-reconstruction logic from write_blastware_file so both writers (BW binary) and decoders (sample arrays) can use the same canonical bytes. 2. waveform_codec.decode_a5_frames(a5_frames) — production entry point. Returns the raw_samples dict consumers expect (Tran/Vert/Long as int16 ADC counts; MicL as native ADC counts). Internally: A5 frames → extract_body_bytes → decode_waveform_v2 → decoded_to_adc_counts (geos ×16; mic pass-through) 3. waveform_codec.mic_count_to_db(count) — MicL ADC → dB(L) per BW's display formula: dB = sign(count) × (81.94 + 20 × log10(\|count\|)) for \|count\| ≥ 1 Verified against V70 fixture: count=813 → 140.14 dB (BW PSPL 140.1). client.py:_decode_a5_waveform is reduced to a thin wrapper that calls decode_a5_frames and populates event.raw_samples. Original implementation preserved as _decode_a5_waveform_LEGACY (dead code; reference only). Also fixed a tail-end bug in decode_waveform_v2 where trailer-section "40 02" markers (containing ASCII serial bytes, NOT real segment headers) were being mis-interpreted, producing 2 spurious samples per channel at the end of each event. Added bytes [12:14] == "02 00" validation to reject non-header markers. 7 new pytest tests cover the new helpers and dB conversion. Total: 71 passing (up from 64). Known limitation (carried over from before): the walker still stops mid-event on the loudest fixtures (SP0/SS0/SV0/event-b) at some mid-segment edge cases not yet characterized. Every sample reached is decoded correctly; the walker just doesn't reach all of them. Loud events still yield 5,000–15,000 byte-exact samples each.	2026-05-20 17:28:54 +00:00
Claude	2ff2762eec	codec-re: 30 NN block CRACKED — codec fully decoded User intuition (16-bit) + 12-bit packing hypothesis + the int16 ADC range constraint led to the final piece. 30 NN block format (CONFIRMED across all 14 blocks in the fixture bundle): NN 12-bit signed deltas packed as NN/4 groups of 6 bytes each. Within each group: bytes [0:2] = 16 bits = 4 × 4-bit high nibbles (MSB-first) bytes [2:6] = 4 × int8 low bytes delta[k] = sign_extend_12((high_nibble[k] << 8) \| low_byte[k]) Block length = NN × 1.5 + 2 bytes (tag included). Earlier walker used NN × 4 which is only correct in the TRAILER section. Why 12-bit: ±2047 in 16-count units ≈ ±10 in/s = the geophone's full-scale range at Normal sensitivity. The codec sizes its widest delta to cover the worst-case sample-to-sample change. Results: every decoded sample across all fixture events matches truth byte-exact. ZERO divergences. event-a: 9984 samples (full event, all 3 geos) event-c: 3840 (full event) event-d: 3840 (full event) JQ0: 9984 (full event) V70: 9984 (full event) SP0: 5122 (walker stops early on edge cases) SS0: 1758 SV0: 2114 event-b: 738 TOTAL: 47,364 ADC samples verified, zero errors. Three full 3-sec events decode end-to-end across all three geo channels. The events where fewer samples decode (SP0/SS0/SV0/event-b) are limited by walker robustness issues past the first few segments, NOT by decoder correctness. 64 tests pass (up from 55). Files: minimateplus/waveform_codec.py (new 30 NN decode + corrected walker length), tests/test_waveform_codec.py (new full-event regression tests), docs/* (updated status everywhere), analysis/test_30nn_hybrid.py (new — the analysis script that confirmed the format).	2026-05-20 17:28:54 +00:00
Claude	d4cdce77fa	codec-re: 30 NN partial finding — sum matches but per-sample distribution doesn't Tested the 12-bit signed packed delta hypothesis (motivated by the observation that ±2047 in 16-count units ≈ ±32K raw ADC counts, almost exactly the int16 ADC range — a strong design hint). Result: mixed. For SP0 block @1689 (V seg 4, samples 650..653): truth deltas: 47, 297, 384, 61 (sum = 789) 12-bit BE contiguous pred: 17, 47, 664, 61 (sum = 789) Positions 1 and 3 of the pred match truth values at positions 0 and 3 exactly, AND the total sum across all 4 positions matches. But positions 0 and 2 of pred don't match any truth value. Hypothesis space narrows to: - 12-bit deltas WITH a specific re-ordering or interleaving - 12-bit deltas with one of the positions being a "step size" or "checksum-like" repacked value - A nonlinear / coded format where the underlying total displacement is preserved but per-sample distribution is encoded differently Two analysis scripts committed (test_30nn_12bit.py, test_30nn_v2.py). The v2 script uses a real-decoder simulation to get the exact channel + sample-index for each 30 NN block, eliminating off-by-one errors in the truth lookup.	2026-05-20 17:28:54 +00:00
Claude	ce5dc640ba	codec-re: quiet bundle decodes FULLY (17k samples, zero errors) User asked the right question: do events without 30 NN blocks decode fully? Answer: YES. event-a: Tran 3328 ✓ Vert 3328 ✓ Long 3328 ✓ (28 segments, 0 '30 NN') event-c: Tran 1280 ✓ Vert 1280 ✓ Long 1280 ✓ (12 segments, 0 '30 NN') event-d: Tran 1280 ✓ Vert 1280 ✓ Long 1280 ✓ (12 segments, 0 '30 NN') 17,664 ADC samples decoded byte-exact against BW's ASCII export. Zero divergences across event-a, event-c, event-d. This means the codec is FULLY SOLVED for any event without 30 NN blocks. The remaining gap is the 30 NN block format only — used for high-amplitude regions where deltas exceed int8 range. For quiet events (or quiet stretches of loud events), the decoder is complete. 9 new regression tests bring the total to 55, all passing. Files: tests/test_waveform_codec.py + docs/waveform_codec_re_status.md + new analysis/verify_quiet_bundle.py.	2026-05-20 17:28:54 +00:00
Claude	07675626dc	codec-re: channel rotation CONFIRMED — full multi-channel decoder works The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py) ran and immediately confirmed the rotation hypothesis: SP0 seg 0: best fit Vert 508/508 ✓ SP0 seg 1: best fit Long 508/508 ✓ SP0 seg 3: best fit Tran 508/508 ✓ (Tran continuation) SP0 seg 5: best fit Long 508/508 ✓ SP0 seg 9: best fit Long 508/508 ✓ V70 seg 0: best fit Vert 508/508 ✓ V70 seg 1: best fit Long 508/508 ✓ Channels rotate Tran → Vert → Long → MicL per 40 02 segment header. Also discovered the segment header has DOUBLE duty: bytes [14:18] anchor the NEW segment's channel (2 samples as int16 BE in 16-count units), AND bytes [0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as int16 BE). This is the same "2 anchors + delta stream" structure as the body preamble for Tran. decode_waveform_v2 now returns full per-channel sample dicts. Byte-exact verified ranges: V70: Tran 512, Vert 512, Long 512 (all first segments) JQ0: Tran 512, Vert 258 SP0: Long 1536 (all 3 L segments) Still open: the 30 NN block format (high-amplitude packed deltas) — appears mid-segment when single-byte deltas can't carry the magnitude. 6 new tests bring the count to 46. All passing.	2026-05-20 17:28:54 +00:00
Claude	ae0e17b5dc	codec-re: handoff polish — readmes, skeleton, remove decode-re/ duplicate Three things to make pickup smoother: 1. analysis/README.md (NEW): catalogues the ~25 scratch scripts. Categorizes them as "still useful" / "superseded — keep for archaeology" / "pure exploration". Tells a fresh engineer which files to read first and which to ignore. 2. scratch/next_experiment_skeleton.py (NEW): stub + spec for the segment-channel scoring analyzer. Includes the fixture loader, block walker, and decode-segment-as-channel helper — just enough scaffolding that the next pass starts from "fill in score_segment_against_all_channels()" rather than from scratch. Already runs and confirms 13 segments per 3-sec event with sample starts going to 6590 (way past the 3328 actual samples) — strong evidence that not all segments carry Tran. 3. Removed decode-re/ duplicate. It was a mirror of tests/fixtures/. Analysis scripts that hardcoded decode-re/ paths updated to point at tests/fixtures/. CLAUDE.md note updated: future event uploads go directly into a dated subdirectory under tests/fixtures/. All 40 tests still pass. Skeleton runs.	2026-05-20 17:28:54 +00:00
Claude	f68ee9f0f9	docs: clean up waveform-codec doc layers per review Three "truth layers" had drifted apart between commits. Fixed: 1. waveform_codec.py docstring rewritten from the 2026-05-08 "structural framing only" state to the 2026-05-11 "Tran segment 0 solved + segment-header partially decoded" state. Killed stale "~80 sample-sets per segment" language (real segments are flash-page-byte-sized, not sample-count-sized; observed first-segment sizes are 42-510 samples depending on signal). Killed stale "preamble is 7 or 9 bytes" language (always 7). 2. docs/instantel_protocol_reference.md §7.6.1: added a clear "CURRENT STATUS" box at the top with a status table. Replaced the stale "~80 sample-sets" line with the verified per-event segment sizes. Merged two redundant segment-header field-table sections. 3. docs/waveform_codec_re_status.md (NEW): clean working-status doc. Solved / not solved / hypothesis / next experiment / fixtures / tests. The protocol reference remains the historical Rosetta Stone; this new file is the current-truth working note that shouldn't accumulate fossil layers. 4. CLAUDE.md §"Waveform body codec": prominent warning box at top — "DO NOT TRUST decoded sample arrays yet." BW binary passthrough is the only sample-bearing output to trust until the decoder lands. Added a "Next experiment" subsection pointing the next pass at the segment-channel scoring analyzer. 40 tests still pass.	2026-05-20 17:28:54 +00:00
Claude	5bf5329369	codec-re: add Waveform body codec section to CLAUDE.md Mirrors the structural findings now documented in docs/instantel_protocol_reference.md §7.6.1: block framing solved, Tran segment-0 decode verified across 5 fixture events, multi-segment continuation still open. Also adds waveform_codec.py to the project layout map.	2026-05-20 17:28:54 +00:00
Claude	9ed6f2a8d8	codec-re: add segment 1 block dumper for analysis Investigated multi-segment Tran continuation but couldn't crack it. Each hypothesis tried (segment header consumes 0/1/2 T deltas, blocks continue Tran with various interpretations) breaks at sample ~512. Block budget for V70 segment 1: 264 nibbles + 244 RLE zeros = 508 deltas — exactly the segment size. So the block structure CAN encode 508 single-channel samples, but applying segment 1 blocks as Tran gives wrong values. Most likely the channel ordering changes in segment 1+ (e.g., segment 0 = Tran, segment 1 = Vert, segment 2 = Long, etc.) but I couldn't verify cleanly. Stopping here — segment-0 Tran decode is solid and multi-segment work needs more fresh thinking.	2026-05-20 17:28:54 +00:00
Claude	a0c9a482c7	codec-re: 00 NN is RLE; full Tran segment-0 decode (4 of 5 events) User uploaded a Vert-heavy event (JQ0) and a Mic-heavy event (V70). Those two were exactly what was needed to crack the next piece: - 00 NN block = run-length-encoded zero deltas in the current channel. Append NN copies of the current cumulative value (no change). - find_data_start now recognizes 00 NN as a valid first tag (some events begin with a leading 00 NN RLE block). - decode_tran_initial now decodes the FULL segment 0 (not just the first data block). Results across 5 fixture events: - M529LL1A.SP0 (loud-all-channels) : 510 / 510 ✓ - M529LL1L.JQ0 (Vert-heavy) : 510 / 510 ✓ - M529LL1L.V70 (Mic-heavy) : 510 / 510 ✓ - M529LL1A.SV0 (loud-from-start) : 58 / 58 ✓ - M529LL1A.SS0 (loud-from-start) : 42 / 502 (stops at first 30 04) The 30 04 block (only seen in loud-from-start events) hasn't been decoded yet — likely a channel-switch marker for the high-amplitude regime. Also discovered: segment header (40 02) payload bytes [0:2] = T_delta at first sample of new segment, [6:8] = byte length to next segment. Multi-segment Tran decoding still diverges after sample 512 because the per-segment channel ordering after the header is unknown. Tests: 40 pass (up from 36). Files: - minimateplus/waveform_codec.py: find_data_start fix, RLE handling, full segment-0 decode in decode_tran_initial - tests/test_waveform_codec.py: synthetic RLE test, full segment 0 tests for JQ0 and V70 - tests/fixtures/5-11-26/: M529LL1L.JQ0, M529LL1L.V70 + TXT exports - docs/instantel_protocol_reference.md §7.6.1: RLE + segment-header docs	2026-05-20 17:28:54 +00:00
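The `00 NN` rule stated in this commit (append NN copies of the current cumulative value) amounts to the following. The helper name `apply_zero_run` is hypothetical and this is only a sketch of the rule as described, not the repository's implementation.

```python
def apply_zero_run(samples, nn):
    """Extend a decoded sample list by an RLE run of NN zero deltas.

    A zero delta leaves the running value unchanged, so the run is NN
    repeats of the last sample decoded so far.
    """
    if not samples:
        raise ValueError("a zero run needs at least one prior sample")
    return samples + [samples[-1]] * nn
```

For example, a run with NN = 3 applied to `[5, 7]` repeats the trailing 7 three more times.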
Claude	6ac126e05c	codec-re: crack Tran channel codec with high-amplitude May 11 bundle User uploaded 3 high-amplitude events (PPV 6-7 in/s — shook the geophone hard) to decode-re/5-11-26/. These cracked the Tran codec: - Preamble bytes [3:5] and [5:7] = Tran[0] and Tran[1] as int16 BE in 16-count units (LSB = 0.005 in/s). Confirmed across all 7 fixtures. - First data block carries Tran deltas from sample 2 onward: * 10 NN block: NN/2 bytes of payload, each byte = two 4-bit signed nibble deltas (high nibble first) * 20 NN block: NN int8 signed deltas Verified 22+42+46 = 110 Tran samples across SP0/SS0/SV0 with 0 errors against BW's ASCII export. Why the earlier 96-combination brute force failed: the quiet 5-8 events all had T[0] = T[1] ≈ 0 so the preamble's per-channel encoding was undetectable. Loud events made the encoding obvious. What's solved: - minimateplus.waveform_codec.decode_tran_initial: returns first N Tran samples in 16-count units for any body. - Walker length formula for in-data 30 NN blocks (NN*2 instead of NN*4). - Walker now handles bodies that start with 20 NN (in addition to 10 NN). What's still open: - Tran past the first data block (multi-block channel switching). - Vert / Long / MicL channel encodings. - Walker correctness past offset ~427 in event-b. Tests: 36 pass. decode_waveform_v2 still returns None — the full multi-channel decoder is not wired up. decode_tran_initial is the new verified entry point. Files: minimateplus/waveform_codec.py, tests/test_waveform_codec.py (adds 5-11-26 fixtures + decode_tran_initial tests), and docs/instantel_protocol_reference.md §7.6.1 (Tran codec spec).	2026-05-20 17:28:54 +00:00
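The preamble anchors and the two delta block payloads described in this commit can be sketched as follows. All function names are hypothetical. Two points are assumptions rather than facts from the commit: that the "4-bit signed" nibbles are two's complement (range -8 to 7), and that deltas accumulate in the same 16-count units as the anchors.

```python
import struct


def read_preamble_anchors(preamble):
    """Tran[0] and Tran[1] from preamble bytes [3:5] and [5:7], int16 BE."""
    return struct.unpack(">hh", preamble[3:7])


def signed_nibble(n):
    """Interpret a 4-bit value as signed (assumed two's complement)."""
    return n - 16 if n >= 8 else n


def decode_nibble_payload(payload):
    """10 NN payload: each byte holds two signed deltas, high nibble first."""
    deltas = []
    for byte in payload:
        deltas.append(signed_nibble(byte >> 4))
        deltas.append(signed_nibble(byte & 0x0F))
    return deltas


def decode_int8_payload(payload):
    """20 NN payload: one signed int8 delta per byte."""
    return list(struct.unpack(f"{len(payload)}b", payload))


def accumulate(anchors, deltas):
    """Rebuild samples by running-summing deltas after the two anchors."""
    samples = list(anchors)
    for d in deltas:
        samples.append(samples[-1] + d)
    return samples
```

With this layout, a `10 NN` payload of NN/2 bytes yields NN deltas, matching the length rule in the commit message.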
Claude	d3f77d1d96	codec-re: solve waveform body block framing; per-byte sample mapping still open Decoded the structural framing of the Blastware waveform body — the bytes between the 21-byte STRT record and the 26-byte file footer. The body is a sequence of tagged variable-length blocks, NOT raw int16 LE. Five tag types (10/20/00/30/40 NN) and their lengths are now confirmed against the 4-event May 2026 fixture bundle. Body splits cleanly into ~16 segments (for a 1280-sample event) separated by 40 02 segment headers carrying a monotonically incrementing uint32 LE counter at bytes [8:12]. What's done: - minimateplus/waveform_codec.py — block walker, segment splitter, segment header parser. decode_waveform_v2 is a stub returning None until the byte-to-sample mapping is solved; client.py is unchanged. - tests/test_waveform_codec.py — 31 tests covering block detection, lengths, contiguous-walk, segment splitting, segment-header parsing, and counter monotonicity. All pass. - tests/fixtures/decode-re-5-8-26/ — bundled fixtures (4 events, BW binary + Blastware ASCII export each). - docs/instantel_protocol_reference.md §7.6.1 — replaced retraction box with the verified structural decoding plus an explicit list of what's still open. What's still open: the per-byte mapping inside 10 NN / 20 NN blocks. 96 channel-permutation × nibble-order × sign-convention combinations were brute-force tested; none match BW's ASCII export to within ±1 ADC count. The codec is more elaborate than uniform 4-bit deltas — likely a hybrid variable-bit-width scheme with segment-anchor resync points. Next recommended step: capture an event with a known calibration tone to pin down magnitude scaling. Walker also bails out partway through event-b (open issue documented in both the module and the protocol reference).	2026-05-20 17:28:54 +00:00
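The counter check mentioned in this commit (a uint32 LE value at bytes [8:12] of each `40 02` header payload) could be expressed roughly as below. The helper names are hypothetical, and reading "monotonically incrementing" as strictly increasing is an assumption.

```python
import struct


def header_counter(payload):
    """Read the uint32 LE counter at bytes [8:12] of a header payload."""
    return struct.unpack("<I", payload[8:12])[0]


def counters_increase(payloads):
    """True if each header's counter is greater than the one before it."""
    counters = [header_counter(p) for p in payloads]
    return all(b > a for a, b in zip(counters, counters[1:]))
```

A walker that mis-splits segments would likely trip this check, which makes it a cheap sanity test for the framing.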
serversdown	7bd0f8badf	Pull in v0.18 - Merge branch 'main' into codec-re	2026-05-20 16:50:03 +00:00
serversdown	f7c5c9fed3	Merge branch 'main' into codec-re	2026-05-17 23:30:29 +00:00
serversdown	84ee68f889	Merge branch 'main' into codec-re	2026-05-11 22:27:25 -04:00
serversdown	20519383fe	add additional events for decode	2026-05-11 18:13:24 -04:00
serversdown	3402b4d11a	add additional events for decode-RE	2026-05-11 14:17:21 -04:00