codec-re: channel rotation CONFIRMED — full multi-channel decoder works

The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py)
ran and immediately confirmed the rotation hypothesis:

  SP0 seg 0: best fit Vert  508/508  ✓
  SP0 seg 1: best fit Long  508/508  ✓
  SP0 seg 3: best fit Tran  508/508  ✓  (Tran continuation)
  SP0 seg 5: best fit Long  508/508  ✓
  SP0 seg 9: best fit Long  508/508  ✓
  V70 seg 0: best fit Vert  508/508  ✓
  V70 seg 1: best fit Long  508/508  ✓
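
A minimal sketch of how a "best fit N/N" line like those above could be computed. It assumes the decoded segment and the BW ASCII truth are both available as lists of integer samples over the same sample range. `score_against_truth` and its arguments are hypothetical names for illustration, not the code in scratch/next_experiment_skeleton.py.

```python
from typing import Dict, List, Tuple


def score_against_truth(
    decoded: List[int],
    truth_by_channel: Dict[str, List[int]],
) -> Tuple[str, Dict[str, Tuple[int, int]]]:
    """Count exact sample matches per channel and return the best fit.

    Hypothetical helper: `decoded` is one segment's reconstructed samples,
    `truth_by_channel` maps a channel name to the reference samples for
    the same sample range.
    """
    scores: Dict[str, Tuple[int, int]] = {}
    for name, truth in truth_by_channel.items():
        compared = min(len(decoded), len(truth))
        hits = sum(1 for d, t in zip(decoded, truth) if d == t)
        scores[name] = (hits, compared)
    # The channel with the most exact matches wins.
    best = max(scores, key=lambda name: scores[name][0])
    return best, scores
```

A segment that matches one channel on every compared sample (for example 508/508) and the others on almost none is strong evidence for that channel.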

Channels rotate Tran → Vert → Long → MicL per 40 02 segment header.
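
The rotation can be sketched as a lookup, assuming segments are numbered from 0 at the first 40 02 header and the body before that header is the implicit Tran segment. That numbering is an inference from the results listed above (seg 0 = Vert, seg 1 = Long, seg 3 = Tran), and `channel_for_segment` is a hypothetical name.

```python
ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_for_segment(header_index: int) -> str:
    """Map a 0-based 40 02 header index to the channel its segment carries.

    Assumption: the initial body (before any 40 02 header) already used the
    Tran slot, so header 0 starts at Vert.
    """
    if header_index < 0:
        raise ValueError("header_index must be >= 0")
    return ROTATION[(header_index + 1) % len(ROTATION)]
```

Under this assumption segs 1, 5 and 9 all map to Long, which agrees with the three SP0 Long segments above.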

Also discovered the segment header has DOUBLE duty: bytes [14:18] anchor
the NEW segment's channel (2 samples as int16 BE in 16-count units), AND
bytes [0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as
int16 BE).  This is the same "2 anchors + delta stream" structure as the
body preamble for Tran.
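
A rough sketch of reading the two double-duty fields. It assumes the byte offsets are relative to the payload that follows the 2-byte 40 02 tag (18 payload bytes in a 20-byte block). `SegmentHeader` and `parse_segment_header` are hypothetical names. Values are returned raw, with no scaling from 16-count units applied.

```python
import struct
from typing import NamedTuple, Tuple


class SegmentHeader(NamedTuple):
    prev_deltas: Tuple[int, int]  # bytes [0:4]: 2 more deltas for the previous channel
    anchors: Tuple[int, int]      # bytes [14:18]: samples 0 and 1 of the new channel


def parse_segment_header(payload: bytes) -> SegmentHeader:
    """Extract the continuation deltas and new-channel anchors (int16 BE)."""
    if len(payload) < 18:
        raise ValueError("expected at least 18 payload bytes")
    d0, d1 = struct.unpack_from(">hh", payload, 0)
    a0, a1 = struct.unpack_from(">hh", payload, 14)
    return SegmentHeader((d0, d1), (a0, a1))
```

The previous channel would then gain two samples by adding `prev_deltas` cumulatively to its last value, while the new channel starts from `anchors` and continues with its own delta stream.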

decode_waveform_v2 now returns full per-channel sample dicts.
Byte-exact verified ranges:
  V70: Tran 512, Vert 512, Long 512   (all first segments)
  JQ0: Tran 512, Vert 258
  SP0: Long 1536 (all 3 L segments)

Still open: the 30 NN block format (high-amplitude packed deltas) —
appears mid-segment when single-byte deltas can't carry the magnitude.

6 new tests bring the count to 46.  All passing.
Claude
2026-05-12 03:57:38 +00:00
committed by serversdown
parent ae0e17b5dc
commit 07675626dc
6 changed files with 365 additions and 136 deletions
+34 -29
@@ -86,44 +86,49 @@ is actually a tagged-block stream with a custom delta+RLE codec.
 - **Block framing** — 5 tag types (`10 NN`, `20 NN`, `00 NN`, `30 NN`,
 `40 02`) with confirmed lengths. Implementation: `walk_body()` in
 `minimateplus/waveform_codec.py`.
-- **Tran channel segment 0** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
+- **Per-channel codec** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
 as int16 BE in **16-count units** (LSB = 0.005 in/s). Then `10 NN`
 (4-bit nibble deltas), `20 NN` (int8 deltas), and `00 NN` (RLE zero
-deltas) carry Tran deltas from sample 2 onward. Verified byte-perfect
-across 4 of 5 fixture events (510 samples each). Implementation:
-`decode_tran_initial()`.
-- **Segment header** — `40 02` is a 20-byte block. Payload bytes [0:2]
-are the T_delta at the start of the new segment (int16 BE). Bytes
-[6:8] are the byte length to the next segment header. Bytes [8:12]
-are a monotonic uint32 LE counter. Bytes [12:14] are constant `02 00`.
+deltas) carry per-channel deltas from sample 2 onward.
+- **Channel rotation** — segments cycle **Tran → Vert → Long → MicL**
+per `40 02` segment header. Each segment carries ~512 sample-sets of
+ONE channel. The initial body (before the first `40 02`) is the
+implicit Tran segment.
+- **Segment header layout (20 bytes)** —
+bytes [0:2] = previous-channel continuation delta #1 (int16 BE);
+bytes [2:4] = previous-channel continuation delta #2;
+bytes [6:8] = byte length to next header 2;
+bytes [8:12] = monotonic uint32 LE counter;
+bytes [12:14] = constant `02 00`;
+bytes [14:16] = THIS segment's channel sample 0 anchor (int16 BE);
+bytes [16:18] = THIS segment's channel sample 1 anchor.
+- **`decode_waveform_v2()`** returns full per-channel sample dicts.
+Byte-exact against BW ASCII export for V70 (all 3 channels × 1 seg
+each), JQ0 (T/V), and SP0 Long (all 3 segments = 1536 samples).
 ### What's NOT solved
-- **Tran past segment 0** — multi-segment Tran continuation has been
-attempted but every hypothesis tested breaks at sample ~512. Likely
-channels rotate across segments (e.g. segment 0 = Tran, segment 1 = Vert,
-…) but this is unverified.
-- **Vert / Long / Mic channels** — no per-channel decoder yet. These
-almost certainly live in later segments but the segment-to-channel
-mapping is open.
-- **The `30 NN` block content** — appears in loud-from-start events
-(SS0, SV0) and breaks the simple Tran walk there. Probably a channel-
-switch or alternative-encoding marker for high-amplitude regions.
+- **The `30 NN` block content** — these blocks appear in high-amplitude
+regions where sample-set deltas exceed what int8 in `20 NN` can
+express. Probably a packed multi-byte delta format. Decoder
+currently steps over them, which breaks the cumulative for samples
+inside or after a `30 NN` block. See
+`docs/waveform_codec_re_status.md` for the analysis so far.
+- **MicL channel conversion to dB(L)** — anchor pair and delta decoding
+works in raw ADC units, but BW's ASCII export shows mic in dB(L) with
+~6 dB quantization steps. Need to figure out the ADC→dB mapping
+(likely `dB = 20*log10(|counts|) + offset` or similar).
 ### Next experiment
-**Don't hero-code the full decoder.** Build a small analysis tool — a
-segment-channel scoring analyzer. For each segment of each fixture
-event, run the segment-0 Tran block-walk + RLE decode and score the
-cumulative trajectory against the BW ASCII truth for each of {Tran,
-Vert, Long, MicL} over that segment's sample range, trying different
-anchor-bytes candidates from the segment header. The winning
-(channel, anchor-location) combination for each segment reveals
-whether segments rotate channels and which header bytes encode the
-per-segment channel anchors.
+The segment-channel scoring analyzer already ran and confirmed the
+channel-rotation hypothesis. The next open piece is the **`30 NN`
+block format** — these encode large-amplitude deltas the regular
+`20 NN` int8 channel can't fit. Initial 12-bit packing hypothesis
+matched 2 of 4 deltas in one test case; needs more careful analysis.
-See `docs/waveform_codec_re_status.md` for the full specification of
-the next experiment.
+See `docs/waveform_codec_re_status.md` for the data and current
+guesses.
 ### Production-code status