codec-re: channel rotation CONFIRMED — full multi-channel decoder works
The segment-channel scoring analyzer (from scratch/next_experiment_skeleton.py)
ran and immediately confirmed the rotation hypothesis:

  SP0 seg 0: best fit Vert 508/508 ✓
  SP0 seg 1: best fit Long 508/508 ✓
  SP0 seg 3: best fit Tran 508/508 ✓ (Tran continuation)
  SP0 seg 5: best fit Long 508/508 ✓
  SP0 seg 9: best fit Long 508/508 ✓
  V70 seg 0: best fit Vert 508/508 ✓
  V70 seg 1: best fit Long 508/508 ✓

Channels rotate Tran → Vert → Long → MicL per 40 02 segment header.

Also discovered the segment header has DOUBLE duty: bytes [14:18] anchor the
NEW segment's channel (2 samples as int16 BE in 16-count units), AND bytes
[0:4] extend the PREVIOUS channel by 2 more samples (2 deltas as int16 BE).
This is the same "2 anchors + delta stream" structure as the body preamble
for Tran.

decode_waveform_v2 now returns full per-channel sample dicts.

Byte-exact verified ranges:
  V70: Tran 512, Vert 512, Long 512 (all first segments)
  JQ0: Tran 512, Vert 258
  SP0: Long 1536 (all 3 L segments)

Still open: the 30 NN block format (high-amplitude packed deltas), which
appears mid-segment when single-byte deltas can't carry the magnitude.

6 new tests bring the count to 46. All passing.
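The rotation reported above can be checked against the listed analyzer results with a few lines of Python. This is only a sketch: the function name `channel_for_header` is hypothetical, and it assumes that "seg N" counts `40 02` headers from 0 and that the body before the first header is the implicit Tran segment (so header 0 starts Vert).

```python
# Rotation order as stated in the commit message.
ROTATION = ("Tran", "Vert", "Long", "MicL")


def channel_for_header(seg_index: int) -> str:
    """Channel carried by the segment after the seg_index-th `40 02` header.

    Assumption: the initial body (before any header) is Tran, so the
    first header (index 0) begins the Vert segment.
    """
    if seg_index < 0:
        raise ValueError("seg_index must be >= 0")
    return ROTATION[(seg_index + 1) % len(ROTATION)]


# Best-fit channels reported by the analyzer for SP0.
observed = {0: "Vert", 1: "Long", 3: "Tran", 5: "Long", 9: "Long"}
for seg, channel in observed.items():
    assert channel_for_header(seg) == channel
```

Under that indexing assumption, every reported best fit (including the V70 results for seg 0 and seg 1) agrees with a plain modulo-4 rotation.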
@@ -86,44 +86,49 @@ is actually a tagged-block stream with a custom delta+RLE codec.
 - **Block framing** — 5 tag types (`10 NN`, `20 NN`, `00 NN`, `30 NN`,
 `40 02`) with confirmed lengths. Implementation: `walk_body()` in
 `minimateplus/waveform_codec.py`.
-- **Tran channel segment 0** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
+- **Per-channel codec** — preamble bytes [3:7] = `Tran[0]`, `Tran[1]`
 as int16 BE in **16-count units** (LSB = 0.005 in/s). Then `10 NN`
 (4-bit nibble deltas), `20 NN` (int8 deltas), and `00 NN` (RLE zero
-deltas) carry Tran deltas from sample 2 onward. Verified byte-perfect
-across 4 of 5 fixture events (510 samples each). Implementation:
-`decode_tran_initial()`.
-- **Segment header** — `40 02` is a 20-byte block. Payload bytes [0:2]
-are the T_delta at the start of the new segment (int16 BE). Bytes
-[6:8] are the byte length to the next segment header. Bytes [8:12]
-are a monotonic uint32 LE counter. Bytes [12:14] are constant `02 00`.
+deltas) carry per-channel deltas from sample 2 onward.
+- **Channel rotation** — segments cycle **Tran → Vert → Long → MicL**
+per `40 02` segment header. Each segment carries ~512 sample-sets of
+ONE channel. The initial body (before the first `40 02`) is the
+implicit Tran segment.
+- **Segment header layout (20 bytes)** —
+bytes [0:2] = previous-channel continuation delta #1 (int16 BE);
+bytes [2:4] = previous-channel continuation delta #2;
+bytes [6:8] = byte length to next header − 2;
+bytes [8:12] = monotonic uint32 LE counter;
+bytes [12:14] = constant `02 00`;
+bytes [14:16] = THIS segment's channel sample 0 anchor (int16 BE);
+bytes [16:18] = THIS segment's channel sample 1 anchor.
+- **`decode_waveform_v2()`** returns full per-channel sample dicts.
+Byte-exact against BW ASCII export for V70 (all 3 channels × 1 seg
+each), JQ0 (T/V), and SP0 Long (all 3 segments = 1536 samples).

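The header layout and the "2 anchors + delta stream" structure listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in `minimateplus/waveform_codec.py`: the names `SegmentHeader`, `parse_segment_header`, and `rebuild_samples` are hypothetical, offsets are taken relative to the payload that follows the `40 02` tag, the endianness of bytes [6:8] is not stated so they are kept raw, and anchors and deltas are assumed to share the same raw units.

```python
import struct
from itertools import accumulate
from typing import NamedTuple, Sequence, Tuple


class SegmentHeader(NamedTuple):
    prev_deltas: Tuple[int, int]   # bytes [0:4], extend the PREVIOUS channel
    length_field: bytes            # bytes [6:8], raw (endianness not stated)
    counter: int                   # bytes [8:12], uint32 LE
    marker_ok: bool                # bytes [12:14] == 02 00
    anchors: Tuple[int, int]       # bytes [14:18], samples 0 and 1 of THIS channel


def parse_segment_header(payload: bytes) -> SegmentHeader:
    """Parse the documented fields; bytes [4:6] are undocumented and skipped."""
    if len(payload) < 18:
        raise ValueError("need at least 18 payload bytes")
    d1, d2 = struct.unpack_from(">hh", payload, 0)
    (counter,) = struct.unpack_from("<I", payload, 8)
    a0, a1 = struct.unpack_from(">hh", payload, 14)
    return SegmentHeader((d1, d2), payload[6:8], counter,
                         payload[12:14] == b"\x02\x00", (a0, a1))


def rebuild_samples(anchors: Sequence[int], deltas: Sequence[int],
                    continuation: Sequence[int] = ()) -> list:
    """Two anchors, then a running sum over block deltas plus the two
    continuation deltas taken from the NEXT segment header."""
    a0, a1 = anchors
    return [a0, *accumulate([*deltas, *continuation], initial=a1)]


# Tiny worked example with made-up values.
assert rebuild_samples((10, 12), [1, -2, 0], (3, 1)) == [10, 12, 13, 11, 11, 14, 15]
```

One observation that seems consistent with the reported figures: 2 anchors plus 508 block deltas gives 510 samples, and the 2 continuation deltas from the next header bring that to 512.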
 ### What's NOT solved

-- **Tran past segment 0** — multi-segment Tran continuation has been
-attempted but every hypothesis tested breaks at sample ~512. Likely
-channels rotate across segments (e.g. segment 0 = Tran, segment 1 = Vert,
-…) but this is unverified.
-- **Vert / Long / Mic channels** — no per-channel decoder yet. These
-almost certainly live in later segments but the segment-to-channel
-mapping is open.
-- **The `30 NN` block content** — appears in loud-from-start events
-(SS0, SV0) and breaks the simple Tran walk there. Probably a channel-
-switch or alternative-encoding marker for high-amplitude regions.
+- **The `30 NN` block content** — these blocks appear in high-amplitude
+regions where sample-set deltas exceed what int8 in `20 NN` can
+express. Probably a packed multi-byte delta format. Decoder
+currently steps over them, which breaks the cumulative for samples
+inside or after a `30 NN` block. See
+`docs/waveform_codec_re_status.md` for the analysis so far.
+- **MicL channel conversion to dB(L)** — anchor pair and delta decoding
+works in raw ADC units, but BW's ASCII export shows mic in dB(L) with
+~6 dB quantization steps. Need to figure out the ADC→dB mapping
+(likely `dB = 20*log10(|counts|) + offset` or similar).

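The candidate MicL mapping quoted above can be tried out directly. The sketch below only implements the guessed formula `dB = 20*log10(|counts|) + offset`; the function name `counts_to_db` and the `offset_db` parameter are hypothetical, and the real offset (and whether this formula applies at all) remains unknown.

```python
import math


def counts_to_db(counts: int, offset_db: float = 0.0) -> float:
    """Guessed ADC -> dB mapping: 20*log10(|counts|) + offset.

    offset_db is a placeholder; the true calibration offset is not known.
    """
    if counts == 0:
        raise ValueError("log10 is undefined for 0 counts")
    return 20.0 * math.log10(abs(counts)) + offset_db


# Doubling the count magnitude shifts the result by 20*log10(2) dB,
# independent of the offset.
step = counts_to_db(2) - counts_to_db(1)
assert abs(step - 20.0 * math.log10(2.0)) < 1e-12
```

Since 20*log10(2) is about 6.02 dB, the ~6 dB steps in the export would fit a mapping where the magnitude changes by factors of two, though that is a hypothesis to test against the fixtures rather than a conclusion.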
 ### Next experiment

-**Don't hero-code the full decoder.** Build a small analysis tool — a
-segment-channel scoring analyzer. For each segment of each fixture
-event, run the segment-0 Tran block-walk + RLE decode and score the
-cumulative trajectory against the BW ASCII truth for each of {Tran,
-Vert, Long, MicL} over that segment's sample range, trying different
-anchor-bytes candidates from the segment header. The winning
-(channel, anchor-location) combination for each segment reveals
-whether segments rotate channels and which header bytes encode the
-per-segment channel anchors.
+The segment-channel scoring analyzer already ran and confirmed the
+channel-rotation hypothesis. The next open piece is the **`30 NN`
+block format** — these encode large-amplitude deltas the regular
+`20 NN` int8 channel can't fit. Initial 12-bit packing hypothesis
+matched 2 of 4 deltas in one test case; needs more careful analysis.

-See `docs/waveform_codec_re_status.md` for the full specification of
-the next experiment.
+See `docs/waveform_codec_re_status.md` for the data and current
+guesses.

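The segment-channel scoring idea referenced in this section can be sketched as below. It is not the contents of `scratch/next_experiment_skeleton.py`; the names `score` and `best_fit` and the candidate labels are hypothetical, and the sketch assumes the BW ASCII truth has already been converted to the same integer units as the decoded deltas and sliced to start at the first sample after the anchor.

```python
from itertools import accumulate
from typing import Dict, Sequence, Tuple


def score(deltas: Sequence[int], anchor: int, truth: Sequence[int]) -> int:
    """Count samples where anchor + running sum of deltas equals the truth."""
    trajectory = list(accumulate(deltas, initial=anchor))[1:]
    return sum(1 for got, want in zip(trajectory, truth) if got == want)


def best_fit(deltas: Sequence[int],
             candidates: Dict[str, int],
             truth_by_channel: Dict[str, Sequence[int]]) -> Tuple[int, str, str]:
    """Try every (channel, anchor candidate) pair; return the top
    (matches, channel, candidate_name). Ties break alphabetically."""
    return max(
        (score(deltas, anchor, truth), channel, name)
        for channel, truth in truth_by_channel.items()
        for name, anchor in candidates.items()
    )


# Made-up data: only Vert with the anchor value 10 reproduces the truth.
deltas = [1, 1, -2]
candidates = {"hdr[16:18]": 10, "hdr[4:6]": 0}
truth = {"Vert": [11, 12, 10], "Long": [3, 4, 5]}
assert best_fit(deltas, candidates, truth) == (3, "Vert", "hdr[16:18]")
```

A result such as "best fit Vert 508/508" would correspond to the winning tuple having 508 matches out of 508 decoded deltas.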
 ### Production-code status
